A Note on Hilbertian Elliptically Contoured Distributions

Yehua Li
Department of Statistics, University of Georgia, Athens, GA 30602, USA
Email address: yehuali@uga.edu

Abstract. In this paper, we discuss elliptically contoured distributions for random variables defined on a separable Hilbert space. They generalize the multivariate elliptically contoured distributions to infinite-dimensional spaces. We discuss some theoretical properties of the Hilbertian elliptically contoured distribution and investigate examples on functional data to illustrate the applications of such distributions.

Keywords: Elliptically contoured distribution, functional data, Hilbertian random variable.

1 Introduction

The elliptically contoured distributions form an important class of distributions in multivariate analysis, with some very nice symmetry properties. They are widely used in statistical practice, for example in dimension reduction (Li, 1991; Cook and Weisberg, 1991; Schott, 1994) and regression graphics (Cook, 1998). The most important member of this class is of course the multivariate Gaussian distribution. Properties of multivariate elliptically contoured distributions are well studied; see for example Cambanis, Huang, and Simons (1981) and Eaton (1986).

Recent developments in statistics have led us to look beyond random vectors in Euclidean space: statistical models for random variables defined on infinite-dimensional Hilbert spaces are in demand. One important example is functional data analysis (Ramsay and Silverman, 2005), where the data are vectors in a function space, e.g. the $L^2$ space. Among Hilbertian distributions, the Gaussian distribution is still the best understood. For example, in functional data analysis, the random functions are
usually modeled as Gaussian processes. The class of elliptically contoured distributions is an important generalization of the Gaussian. It has important applications in dimension reduction (see the recent literature on functional sliced inverse regression: Ferré and Yao, 2003; Li and Hsing, 2007), yet its theoretical properties are not well studied. The goal of this paper is to fill this gap.

The rest of the paper is organized as follows. We introduce some background and definitions regarding linear operators and Hilbertian random variables in Section 2. A random representation for Hilbertian elliptically contoured random variables is introduced in Section 3. Some theoretical properties of the Hilbertian elliptically contoured distribution are discussed in Section 4, including marginal and conditional distributions of a random variable $X$ when it is mapped into different Hilbert spaces. Finally, we give some examples in Section 5 to illustrate the applications of the theory derived in the previous sections, especially its application in functional data analysis.

2 Definitions and Background

2.1 Linear operators

We first introduce some notation and background for linear operators on Hilbert spaces; more theory on linear operators can be found in Dunford and Schwartz (1988). We restrict our discussion to separable Hilbert spaces. A separable Hilbert space $H$ is a Hilbert space with a countable basis $\{e_1, e_2, \dots\}$.

For two Hilbert spaces $H$ and $H'$, a linear operator $T: H \to H'$ is a linear map from $H$ to $H'$, i.e.
$$T(ax) = a(Tx), \quad T(x + y) = Tx + Ty,$$
for any $x, y \in H$ and any scalar value $a$. $T$ is bounded if
$$\|Tx\|_{H'} \le M \|x\|_H, \quad \forall x \in H,$$
for some non-negative real number $M$. Denote the class of bounded linear operators from $H$ to $H'$ by $B(H, H')$; when $H' = H$, this is simplified to $B(H)$.
The adjoint of an operator $T \in B(H, H')$, denoted $T^*$, is an operator mapping $H'$ to $H$, with
$$\langle y, Tx \rangle_{H'} = \langle T^* y, x \rangle_H, \quad \forall x \in H, \; y \in H'.$$
When $H' = H$, $T$ is called self-adjoint if $T^* = T$.

2.2 Hilbertian random variables

Let $H$ be a separable Hilbert space with inner product $\langle \cdot, \cdot \rangle_H$, and let $(\Omega, \mathcal{F}, P)$ be a probability space; then a Hilbertian random variable is a map $X: (\Omega, \mathcal{F}, P) \to H$. Since finite-dimensional Hilbert spaces are isomorphic to Euclidean spaces, where the theory of multivariate analysis applies, we are generically interested in random variables on an infinite-dimensional Hilbert space. In functional data analysis, $H$ could be the $L^2$ function space, a Sobolev space, etc.

The mean of $X$, if it exists, is defined as
$$\mu_X = EX = \int X(\omega) \, dP(\omega),$$
which is the element of $H$ satisfying $\langle b, EX \rangle = E\langle b, X \rangle$ for all $b \in H$. The variance of $X$ is an operator on $H$, defined as
$$V_X(g) = E\{(X - EX) \otimes (X - EX)\}(g) = E\{\langle X - EX, g \rangle (X - EX)\}, \quad g \in H.$$
It is easy to show that $V_X$ is a self-adjoint, non-negative definite operator. The characteristic function of a Hilbertian random variable is
$$\phi_X(f) = E\{\exp(i \langle f, X \rangle)\}, \tag{1}$$
for all $f \in H$.

For a separable Hilbert space, there is a countable basis $\{e_1, e_2, \dots\}$. Define $x_j = \langle e_j, X \rangle$; these are univariate random variables. Then $X$ has the coordinates $(x_1, x_2, \dots)$, which form an $\ell^2$ random variable. Denote the space of $H$-valued random variables by $\mathcal{X}$; then $\mathcal{X}$ is isomorphic to the space of $\ell^2$ random variables.

An operator $T$ is nuclear if its trace is finite and independent of the choice of basis. The trace of an operator is defined as
$$\mathrm{tr}(T) = \sum_{i=1}^\infty \langle e_i, T e_i \rangle,$$
for any complete orthonormal basis of $H$.
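The characteristic function (1) is easy to check by Monte Carlo in a finite truncation. The sketch below is an illustration under assumed numerical choices (a diagonal covariance with arbitrary eigenvalues and an arbitrary direction $f$): for a Gaussian element, whose characteristic function is $\exp(-\langle f, \Gamma f \rangle / 2)$, the empirical value of $E\{\exp(i\langle f, X \rangle)\}$ should match that closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative finite truncation: work in the eigenbasis of Gamma, so that
# Gamma is diagonal with eigenvalues lam and <f, Gamma f> = sum_j lam_j f_j^2.
lam = np.array([1.0, 0.5, 0.25])
n = 200_000

# Coordinates of a Gaussian(0, Gamma) element: independent N(0, lam_j) scores.
X = rng.standard_normal((n, 3)) * np.sqrt(lam)

f = np.array([0.3, -0.7, 1.1])
emp = np.mean(np.exp(1j * X @ f))                 # empirical E exp(i<f, X>)
theory = np.exp(-0.5 * np.sum(lam * f ** 2))      # exp(-<f, Gamma f>/2)
print(round(abs(emp - theory), 3))                 # near 0
```

The same Monte Carlo comparison works for any candidate $\phi_0$, which makes it a convenient sanity check when simulating non-Gaussian elliptically contoured elements later.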
4 for any complete orthonormal basis of H. The covariance operator of X considered in this paper is a self-adjoint, nonnegative definite, nuclear operator, denoted as Γ. Γ has a spectrum decomposition Γ = j λ j ψ j ψ j, (2) where λ 1 λ 2,, 0 are eigenvalues of Γ, ψ j are corresponding eigenvectors. If {ψ j } is incomplete, we can always make them into a complete basis for H by including basis of the null space of Γ. ψ j s are the principal components of the random vector X. Definition 1 A Hilbertian random variable X has an elliptically contoured distribution if the characteristic function of X µ has the form φ X µ (f) = φ 0 ( f, Γf ) for a univariate function φ 0, where Γ is a self-adjoint, non-negative definite, nuclear operator on H. Denote the distribution of X as HEC H (µ, Γ, φ). One important example of elliptically contoured distribution is the Gaussian distribution, whose characteristic function has the form φ X µ (f) = exp( f, Γf /2). 3 Random representation for Hilbertian valued elliptically contoured random variables For a fixed self-adjoint, non-negative definite, nuclear operator Γ, we can define a metric in H, d Γ (x, y) = x y Γ = x y, Γ(x y) 1/2 H. Lemma 2 Suppose φ X µ (f) = φ 0 ( f, Γf ) is the characteristic function of an elliptically contoured distribution, then φ 0 ( ) is a non-negative definite function on H with respect to the metric d Γ (, ), i.e. n a i a j φ 0 {d 2 Γ(f i, f j )} 0, j=1 for any finite collections of {f i ; i = 1,, n} H and for any real values {a i ; i = 1,, n}. 4
Lemma 2 is a straightforward application of the Sazonov theorem (Vakhania, Tarieladze and Chobanyan, 1987), which generalizes Bochner's theorem to infinite-dimensional Hilbert spaces. By the definition of the characteristic function, one can easily see that $\phi_{X-\mu}(-f) = \overline{\phi_{X-\mu}(f)}$, which leads to $\phi_0(\|f\|_\Gamma^2) = \overline{\phi_0(\|f\|_\Gamma^2)}$ since $\|-f\|_\Gamma = \|f\|_\Gamma$. This means $\phi_0(\cdot)$ must be real valued.

Theorem 3. $X \sim \mathrm{HEC}_H(\mu, \Gamma, \phi_0)$ if and only if
$$X \stackrel{d}{=} \mu + RU, \tag{3}$$
where $U \sim \mathrm{Gaussian}(0, \Gamma)$ and $R$ is a non-negative univariate random variable independent of $U$.

Proof: We first prove the "if" part. Suppose (3) is true, and let $F$ be the distribution function of $R$. Then
$$\phi_{X-\mu}(f) = E\exp(iR\langle f, U \rangle) = \int_{[0,\infty)} \exp(-r^2 \langle f, \Gamma f \rangle / 2) \, dF(r). \tag{4}$$
By Definition 1, $X$ is a Hilbertian elliptically contoured random variable.

Conversely, suppose $X$ is an elliptically contoured Hilbertian random variable with characteristic function $\phi_{X-\mu}(f) = \phi_0(\|f\|_\Gamma^2)$. By Lemma 2, $g(t) = \phi_0(t^2)$ is a positive definite function. By Theorem 2 in Schoenberg (1938),
$$g(t) = \int_0^\infty \exp(-t^2 u^2) \, d\alpha(u),$$
for some bounded non-decreasing function $\alpha(u)$. By the definition of the characteristic function, we have $1 = \phi_0(0) = \int_0^\infty d\alpha(u)$. Therefore, $\alpha(\cdot)$ is the cumulative distribution function of a non-negative random variable. We now change variables: let $t = \|f\|_\Gamma$ and define a random variable $R$ such that $2^{-1/2} R$ has distribution function $\alpha(\cdot)$. Let $F$ be the distribution function of $R$; then $F(r) = \alpha(2^{-1/2} r)$. We have
$$\phi_{X-\mu}(f) = \phi_0(\langle f, \Gamma f \rangle) = \int_0^\infty \exp(-r^2 \langle f, \Gamma f \rangle / 2) \, dF(r).$$
Therefore, $X$ has the stochastic representation (3).
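Representation (3) gives a direct recipe for simulating an elliptically contoured element in a finite truncation: draw a Gaussian element $U$ through its eigen-coordinates, draw an independent radius $R$, and form $X = \mu + RU$. The sketch below is a minimal illustration under assumed choices (a diagonal $\Gamma$ with arbitrary eigenvalues, and $R^2 \sim \chi^2_3/3$ so that $E(R^2) = 1$); the sample covariance of $X$ then recovers $E(R^2)\Gamma = \Gamma$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical finite-dimensional truncation: Gamma is diagonal in the chosen
# basis, with eigenvalues lam.
lam = np.array([1.0, 0.5, 0.25, 0.125])
d, n = lam.size, 200_000

# U ~ Gaussian(0, Gamma): independent N(0, lam_j) coordinate scores.
U = rng.standard_normal((n, d)) * np.sqrt(lam)

# R >= 0 independent of U; here R^2 ~ chi^2_3 / 3, so E(R^2) = 1 and
# X = mu + R U is a (truncated) elliptically contoured variable.
R = np.sqrt(rng.chisquare(3, size=n) / 3.0)
mu = np.zeros(d)
X = mu + R[:, None] * U

# Representation (3) implies Cov(X) = E(R^2) * Gamma, which equals Gamma here.
emp_cov = np.cov(X, rowvar=False)
print(np.round(np.diag(emp_cov), 2))   # close to lam
```

Replacing the law of $R$ changes $\phi_0$ but not the Gaussian factor $U$, which is exactly the scale-mixture structure that the proof of Theorem 3 establishes.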
4 Properties of elliptically contoured distributions

We first discuss moment properties of elliptically contoured distributions. Suppose the first two moments of $X$ exist. By (3), $EX = \mu$ and $V(X) = E(R^2)\Gamma$. On the other hand, if we start from the characteristic function, assuming that $\phi_0$ is twice differentiable,
$$V(X) = -\phi^{(2)}_{X-\mu}(0) = -\left[ 2\phi_0'(\langle f, \Gamma f \rangle)\Gamma + 4\phi_0^{(2)}(\langle f, \Gamma f \rangle)(\Gamma f) \otimes (\Gamma f) \right]_{f=0} = -2\phi_0'(0)\Gamma.$$
To make $\Gamma$ identifiable, we can let $E(R^2) = -2\phi_0'(0) = 1$; then $\Gamma = V(X)$ is the covariance operator of $X$.

Theorem 4. Let $H$, $H_1$ and $H_2$ be separable Hilbert spaces, suppose $X \sim \mathrm{HEC}_H(\mu, \Gamma, \phi_0)$, and let $P_1 \in B(H, H_1)$, $P_2 \in B(H, H_2)$ be two bounded operators. Define $X_i = P_i X$, $\mu_i = P_i \mu$, $\Gamma_{ij} = P_i \Gamma P_j^*$, for $i, j = 1, 2$. Suppose $\Gamma_{12} = 0$. Then $X_1 \sim \mathrm{HEC}_{H_1}(\mu_1, \Gamma_{11}, \phi_0)$. If $\Gamma_{22}$ is a finite-dimensional operator, then
$$X_1 \mid X_2 \sim \mathrm{HEC}_{H_1}(\mu_1, \Gamma_{11}, \phi_{T(X_2)}), \tag{5}$$
where $\phi_{T(X_2)}(t^2)$ is a non-negative definite function depending on $T(X_2) = \langle X_2 - \mu_2, \Gamma_{22}^-(X_2 - \mu_2) \rangle_{H_2}^{1/2}$, and $\Gamma_{22}^-$ is a generalized inverse of $\Gamma_{22}$. If $\Gamma_{22}$ is an infinite-dimensional operator, then
$$X_1 \mid X_2 \sim \mathrm{Gaussian}_{H_1}(\mu_1, r^2(X_2)\Gamma_{11}), \tag{6}$$
where $r(\cdot)$ is a deterministic function given in (7).

Proof: By (3), $P_i X = P_i \mu + R U_i$, where $U_i = P_i U \sim \mathrm{Gaussian}(0, \Gamma_{ii})$, for $i = 1, 2$. Therefore, $X_1 \sim \mathrm{HEC}_{H_1}(\mu_1, \Gamma_{11}, \phi_0)$ by Theorem 3. Since $\mathrm{Cov}(U_1, U_2) = P_1 \Gamma P_2^* = 0$, by the properties of Gaussian variables, $U_1$ is independent of $U_2$. Thus $X_1$ depends on $X_2$ only through the information on $R$ provided by $X_2$. Suppose $\Gamma_{22}$ is finite dimensional; then $X_2 \mid R \sim \mathrm{Gaussian}_{H_2}(\mu_2, R^2 \Gamma_{22})$.
Notice that $\Gamma_{22}$ is a finite-dimensional operator (a matrix) with a generalized inverse $\Gamma_{22}^-$. From the theory of the finite-dimensional Gaussian, $T(X_2)$ is a sufficient statistic for $R$, i.e. $X_2 \mid T(X_2)$ is independent of $R$. Therefore, (5) is obtained by Theorem 3:
$$X_1 \mid X_2 \stackrel{d}{=} P_1\mu + U_1 \{R \mid T(X_2)\}.$$

On the other hand, if $\Gamma_{22}$ is infinite dimensional, we claim that $X_2$ provides all the information about $R$. It is easy to see that $\Gamma_{22}$ is self-adjoint, non-negative definite and nuclear; therefore it has a spectral decomposition
$$\Gamma_{22} = \sum_{j=1}^\infty \lambda_j \psi_j \otimes \psi_j,$$
where the $\lambda_j$'s are the positive eigenvalues of $\Gamma_{22}$. Define
$$r_n(X_2) = \Big\{ \frac{1}{n} \sum_{j=1}^n \lambda_j^{-1} \langle X_2 - \mu_2, \psi_j \rangle_{H_2}^2 \Big\}^{1/2}.$$
Notice that $X_2 - \mu_2 = RU_2$, and therefore $\lambda_j^{-1/2} \langle X_2 - \mu_2, \psi_j \rangle_{H_2} \stackrel{d}{=} R U_{2j}$, where the $U_{2j}$ are i.i.d. $\mathrm{Normal}(0, 1)$ independent of $R$. By the Law of Large Numbers,
$$r(X_2) = \lim_{n \to \infty} r_n(X_2) = R \tag{7}$$
with probability 1. Therefore,
$$X_1 \mid X_2 \stackrel{d}{=} \mu_1 + r(X_2) U_1,$$
which is the Gaussian distribution in (6).

Theorem 4 gives the conditional distribution of $X_1$ given $X_2$ when they are uncorrelated, i.e. $\Gamma_{12} = \mathrm{Cov}(P_1 X, P_2 X) = 0$. The following corollary gives the conditional distribution in the more general case.

Corollary 5. Let $H$, $H_1$ and $H_2$ be separable Hilbert spaces, suppose $X \sim \mathrm{HEC}_H(\mu, \Gamma, \phi_0)$, and let $P_1 \in B(H, H_1)$, $P_2 \in B(H, H_2)$; define $X_i = P_i X$, $\mu_i = P_i \mu$, $\Gamma_{ij} = P_i \Gamma P_j^*$, for $i, j = 1, 2$. Define $\mu_1^* = \mu_1 + \Gamma_{12} \Gamma_{22}^- (X_2 - \mu_2)$ and $\Gamma_{11}^* = \Gamma_{11} - \Gamma_{12} \Gamma_{22}^- \Gamma_{21}$. If $\Gamma_{22}$ is a finite-dimensional operator,
$$X_1 \mid X_2 \sim \mathrm{HEC}_{H_1}(\mu_1^*, \Gamma_{11}^*, \phi_{T(X_2)}), \tag{8}$$
where $\phi_{T(X_2)}(t^2)$ is a non-negative definite function depending on $T(X_2) = \langle X_2 - \mu_2, \Gamma_{22}^- (X_2 - \mu_2) \rangle_{H_2}^{1/2}$.
If $\Gamma_{22}$ is an infinite-dimensional operator, then
$$X_1 \mid X_2 \sim \mathrm{Gaussian}_{H_1}(\mu_1^*, r^2(X_2)\Gamma_{11}^*), \tag{9}$$
where $r(\cdot)$ is a deterministic function given in (7).

Proof: First of all, the $\Gamma_{ij} = \mathrm{Cov}(X_i, X_j)$ are bounded operators; as in multivariate analysis, by the Cauchy inequality, $\Gamma_{11}^*$ is bounded and positive semi-definite. Also, $\mu_1^*$ is well defined, since $\mathrm{Cov}(\mu_1^*) = \Gamma_{12}\Gamma_{22}^-\Gamma_{21}$ is bounded by $\Gamma_{11}$ and therefore $P(\mu_1^* \in H_1) = 1$.

Let $U_i = P_i U$, $i = 1, 2$. Since the $U_i$'s are Gaussian, it is easy to check through moment calculations that $(U_1, U_2) \stackrel{d}{=} (Z_1 + \Gamma_{12}\Gamma_{22}^- Z_2, Z_2)$, where $Z_1 \sim \mathrm{Gaussian}_{H_1}(0, \Gamma_{11}^*)$, $Z_2 \sim \mathrm{Gaussian}_{H_2}(0, \Gamma_{22})$, and they are independent. Therefore
$$X_1 \mid X_2 \stackrel{d}{=} \{\mu_1 + RZ_1 + \Gamma_{12}\Gamma_{22}^-(RZ_2)\} \mid X_2 \stackrel{d}{=} \mu_1 + \Gamma_{12}\Gamma_{22}^-(X_2 - \mu_2) + RZ_1 \mid X_2 \stackrel{d}{=} \mu_1^* + Z_1 (R \mid X_2). \tag{10}$$
Here we used the fact that $Z_1$ is independent of $X_2$.

When the range of the operator $\Gamma_{22}$ is finite dimensional, $X_2$ is also finite dimensional. Using arguments as in the proof of Theorem 4, we can show that $(R \mid X_2)$ is a non-negative random variable which depends on the value of $X_2$ only through the statistic $T(X_2)$. Then (8) follows from a direct application of Theorem 3. When $\Gamma_{22}$ is of infinite dimension, as in the proof of Theorem 4, $(R \mid X_2)$ is the deterministic function $r(X_2)$ given by (7), and (9) is proved.

Remark: Although, in this paper, we are only interested in the case that $X$ is defined on an infinite-dimensional Hilbert space $H$, we do allow the Hilbert spaces $H_1$ and $H_2$ that $P_1$ and $P_2$ map into to be finite dimensional. For example, we allow $H_1$ and $H_2$ to be the Euclidean space $\mathbb{R}^m$. See our examples in Section 5.
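The averaging construction (7) used in the proofs above can be checked numerically. In the sketch below, the eigenvalue sequence, the realized value of $R$, and the truncation level $n$ are all illustrative assumptions; conditional on $R$, the law of large numbers drives $r_n(X_2)$ to $R$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: X_2 - mu_2 = R * U_2, observed through its scores
# <X_2 - mu_2, psi_j> = R * sqrt(lam_j) * Z_j with Z_j i.i.d. N(0, 1).
n = 50_000
lam = 1.0 / np.arange(1, n + 1) ** 2      # summable, so Gamma_22 is nuclear
R = 1.7                                    # the realized (unknown) radius
Z = rng.standard_normal(n)
scores = R * np.sqrt(lam) * Z              # <X_2 - mu_2, psi_j>

# r_n from (7): standardize each score by lam_j^{-1/2}, average the squares;
# conditional on R, (1/n) sum_j R^2 Z_j^2 -> R^2 almost surely.
r_n = np.sqrt(np.mean((scores / np.sqrt(lam)) ** 2))
print(round(r_n, 2))                       # close to R
```

This is the sense in which an infinite-dimensional $X_2$ "provides all the information about $R$": infinitely many standardized scores pin $R$ down exactly, which is why the conditional law in (9) is Gaussian with a deterministic scale.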
5 Applications

To show the usefulness of our theory in statistical practice, we provide a few examples which are direct results of the theorems in Section 4.

Example 1 (Principal component analysis): Suppose $X \sim \mathrm{HEC}_H(\mu, \Gamma, \phi_0)$, and the covariance operator $\Gamma$ has the spectral decomposition (2). The $\psi_j$'s are the principal components of $X$. Define the principal component scores $\xi_j = \langle \psi_j, X \rangle_H$ for $j = 1, 2, \dots$; then $X$ has the decomposition
$$X = \mu + \sum_{j=1}^\infty \xi_j \psi_j. \tag{11}$$
Such a decomposition is also called the Karhunen-Loève decomposition (Ash and Gardner, 1975). For any finite collection $\{\psi_{j_1}, \dots, \psi_{j_m}\}$, define an operator $P: H \to \mathbb{R}^m$ by $P = \psi_{j_1} \otimes e_1 + \dots + \psi_{j_m} \otimes e_m$, where $e_k$ is the $k$th column vector of the identity matrix. Then by Theorem 4,
$$(\xi_{j_1}, \dots, \xi_{j_m})^T = P(X - \mu) \sim \mathrm{HEC}_{\mathbb{R}^m}\{0, \mathrm{diag}(\lambda_{j_1}, \dots, \lambda_{j_m}), \phi_0\}.$$
In other words, any finite collection of principal component scores of $X$ follows a multivariate elliptically contoured distribution.

This example also suggests a way to simulate an $\mathrm{HEC}_H(\mu, \Gamma, \phi_0)$ random variable. Since $\lambda_j \to 0$ as $j \to \infty$, we can truncate the series in (11) at a large number $m$ and simulate the first $m$ principal component scores from a multivariate elliptically contoured distribution. This is very useful in simulating functional data.

Example 2 (Conditional moments): Let $X \sim \mathrm{HEC}_H(\mu, \Gamma, \phi_0)$, $P_1 \in B(H, H_1)$, $P_2 \in B(H, H_2)$. By Corollary 5,
$$E(P_1 X \mid P_2 X) = \mu_1 + \Gamma_{12}\Gamma_{22}^-(X_2 - \mu_2).$$
Suppose $\mu = 0$ and $P_1 \Gamma P_2^* = 0$; then $\Gamma_{12} = 0$ and we have $E(P_1 X \mid P_2 X) = 0$.
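A small simulation illustrates the point behind these conditional-moment results: when $\Gamma_{12} = 0$, the components are uncorrelated and the conditional mean vanishes, yet for a non-Gaussian elliptical law they remain dependent through the common radius $R$, which is visible in the correlation of their squares. All numerical choices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative elliptical variable in a two-coordinate truncation with
# diagonal Gamma, so Gamma_12 = 0: X1 and X2 share only the random radius R.
n = 200_000
R = np.sqrt(rng.chisquare(3, size=n) / 3.0)      # E(R^2) = 1
Z = rng.standard_normal((n, 2))
X1, X2 = R * Z[:, 0], R * Z[:, 1]

corr_lin = np.corrcoef(X1, X2)[0, 1]             # near 0: uncorrelated
corr_sq = np.corrcoef(X1 ** 2, X2 ** 2)[0, 1]    # positive: dependent via R
print(round(corr_lin, 3), round(corr_sq, 3))
```

For a Gaussian $X$ (constant $R$), both correlations would be near zero; the positive squared-score correlation is exactly the residual dependence through $R$ that drives the conditional variance results below.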
On the other hand, by the random representation (10),
$$\mathrm{Var}(P_1 X \mid P_2 X) = \Gamma_{11}^* E(R^2 \mid X_2).$$
When $\Gamma_{22}$ is a finite-dimensional operator, $E(R^2 \mid X_2) = g\{T(X_2)\}$ for some univariate function $g$ depending on the elliptically contoured distribution, where $T(X_2)$ is defined in Theorem 4. Therefore
$$\mathrm{Var}(P_1 X \mid P_2 X) = g\{T(X_2)\}\Gamma_{11}^*.$$
When $\Gamma_{22}$ is an infinite-dimensional operator, by (9), $\mathrm{Var}(P_1 X \mid P_2 X) = r^2(X_2)\Gamma_{11}^*$. If $P_1 \Gamma P_2^* = 0$, then $\Gamma_{11}^* = \Gamma_{11}$.

Example 3 (Functional sliced inverse regression): Suppose the Hilbert space $H$ is the $L^2[0, 1]$ function space, so that $X \in H$ is a random function defined on the interval $[0, 1]$. A general functional regression model is given by
$$Y = f(\langle \beta_1, X \rangle, \langle \beta_2, X \rangle, \dots, \langle \beta_K, X \rangle, \epsilon), \tag{12}$$
where $Y$ is a scalar response variable, $\epsilon$ is an error term independent of $X$, $\beta_1, \dots, \beta_K$ are linearly independent coefficient functions, and $f$ is a nonparametric link function. Model (12) is very general and can be very useful in many applications; see the discussion in Ferré and Yao (2003) and Li and Hsing (2007).

Since we do not impose any structure on the link function $f$, the coefficient functions $\beta_k$ are usually unidentifiable, but the subspace spanned by these functions is. This subspace is called the effective dimension reduction space, or the EDR space. We can choose any $K$ orthonormal basis functions in the EDR space as the $\beta_k$'s; these functions are also called the EDR directions. The functional sliced inverse regression (FSIR) approach can be used to estimate the EDR directions and to decide the dimension of the EDR space. We will show that the class of processes $X$ with a Hilbertian elliptically contoured distribution satisfies a key assumption for FSIR, and we will discuss an important result for elliptically contoured functional predictors which is useful for FSIR. For a more comprehensive
account of the method and theory of FSIR, we refer to Ferré and Yao (2003) and Li and Hsing (2007).

One key assumption for FSIR is that, for any $\beta_0 \in H$,
$$E(\langle \beta_0, X \rangle \mid \langle \beta_1, X \rangle, \langle \beta_2, X \rangle, \dots, \langle \beta_K, X \rangle) = c_0 + \sum_{k=1}^K c_k \langle \beta_k, X \rangle \tag{13}$$
for some constants $c_0, \dots, c_K$; see Ferré and Yao (2003). We will show this assumption is satisfied if $X$ is elliptically contoured. Define the operators $P_1 x = \langle \beta_0, x \rangle$ and $P_2 x = (\langle \beta_1, x \rangle, \langle \beta_2, x \rangle, \dots, \langle \beta_K, x \rangle)^T$ for $x \in H$. Notice that $H_2 = \mathbb{R}^K$, and $P_2$ is clearly a finite-dimensional operator. For any vector $v = (v_1, \dots, v_K)^T$,
$$\langle P_2^* v, x \rangle_H = \langle v, P_2 x \rangle_{H_2} = \sum_{k=1}^K v_k \langle \beta_k, x \rangle, \quad \forall x \in H.$$
Therefore, $P_2^* v = (\beta_1, \dots, \beta_K) v$. One can also show that $\Gamma_{22}$ is a $K \times K$ matrix with $(j, k)$th entry equal to $\langle \beta_j, \Gamma \beta_k \rangle$. Similarly, $\Gamma_{12} = (\langle \beta_0, \Gamma \beta_1 \rangle, \dots, \langle \beta_0, \Gamma \beta_K \rangle)$ is a $1 \times K$ matrix. By Corollary 5,
$$E(P_1 X \mid P_2 X) = P_1 \mu + \Gamma_{12} \Gamma_{22}^- (P_2 X - P_2 \mu).$$
Therefore, assumption (13) is satisfied.

Example 4 (Functional sliced inverse regression, continued): Suppose $X$ is elliptically contoured with mean $\mu = 0$. Let $P_2$ be the operator defined in the previous example, and define the operator $P_1 x = (\langle \gamma_1, x \rangle, \dots, \langle \gamma_m, x \rangle)^T$ for a set of orthonormal vectors $\{\gamma_1, \dots, \gamma_m\}$ in $H$. Suppose $P_1 \Gamma P_2^* = 0$, i.e. $\langle \gamma_j, \Gamma \beta_k \rangle = 0$ for $j = 1, \dots, m$ and $k = 1, \dots, K$. By model (12), all the information in $Y$ about $X$ is contained in $P_2 X$, so we have
$$E(P_1 X \mid Y) = E\{E(P_1 X \mid P_2 X) \mid Y\}.$$
By Example 2, $E(P_1 X \mid P_2 X) = 0$, and therefore
$$P_1 E(X \mid Y) = 0. \tag{14}$$
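Assumption (13) can be verified numerically in the Gaussian case, the simplest elliptically contoured law: the population coefficients $\Gamma_{12}\Gamma_{22}^{-}$ computed from the operators should match a least-squares regression of $\langle \beta_0, X \rangle$ on the $\langle \beta_k, X \rangle$. The directions $\beta_0, \beta_1, \beta_2$ and all other numerical choices below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Discretize L^2[0, 1] on a grid, with <f, g> approximated by sum f g dt.
t = np.linspace(0, 1, 101)
dt = t[1] - t[0]
lam = np.array([1.0, 0.5, 0.25, 0.125])
psi = np.array([np.sqrt(2) * np.sin((j + 1) * np.pi * t) for j in range(4)])
Gamma = (psi.T * lam) @ psi                       # covariance kernel on grid

beta0 = t ** 2                                    # illustrative beta_0
B = np.array([np.ones_like(t), t])                # illustrative beta_1, beta_2

# Gamma_12 and Gamma_22 as in Example 3 (entries <beta_j, Gamma beta_k>).
G12 = beta0 @ Gamma @ B.T * dt ** 2               # length-K vector
G22 = B @ Gamma @ B.T * dt ** 2                   # K x K matrix
c = np.linalg.solve(G22, G12)                     # Gamma_12 Gamma_22^{-1}

# Compare with ordinary least squares from a large Gaussian sample.
n = 100_000
xi = rng.standard_normal((n, 4)) * np.sqrt(lam)   # Gaussian PC scores
X = xi @ psi                                       # mean-zero Gaussian curves
y = X @ beta0 * dt                                 # <beta_0, X>
P2X = X @ B.T * dt                                 # (<beta_1, X>, <beta_2, X>)
c_ols, *_ = np.linalg.lstsq(P2X, y, rcond=None)
print(np.round(c, 3), np.round(c_ols, 3))          # nearly equal
```

Since $X$ is mean zero here, $c_0 = 0$ and no intercept is needed; for a non-Gaussian elliptical $X$ the same linear conditional mean holds, only the conditional variance picks up the factor $E(R^2 \mid X_2)$ from Example 2.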
Equation (14) provides information about the shape of the inverse regression curve $E(X \mid Y)$. On the other hand,
$$\mathrm{Var}(P_1 X \mid Y) = E\{\mathrm{Var}(P_1 X \mid P_2 X) \mid Y\} + \mathrm{Var}\{E(P_1 X \mid P_2 X) \mid Y\} = E\{\mathrm{Var}(P_1 X \mid P_2 X) \mid Y\},$$
where the second equality uses $E(P_1 X \mid P_2 X) = 0$. Again, by Example 2, $\mathrm{Var}(P_1 X \mid P_2 X) = g\{T(P_2 X)\}\Gamma_{11}$, and therefore
$$\mathrm{Var}(P_1 X \mid Y) = E[g\{T(P_2 X)\} \mid Y]\Gamma_{11}.$$
Since $\Gamma_{11}$ is the marginal covariance of $P_1 X$, this result shows that the conditional covariance of $P_1 X$ given $Y$ is proportional to the marginal covariance. This result is important for constructing tests for FSIR.

References

[1] Ash, R. B. and Gardner, M. F. (1975). Topics in Stochastic Processes. Academic Press.

[2] Cambanis, S., Huang, S. and Simons, G. (1981). On the theory of elliptically contoured distributions. Journal of Multivariate Analysis, 11.

[3] Cook, R. D. and Weisberg, S. (1991). Comment on "Sliced Inverse Regression for Dimension Reduction" by K. C. Li. Journal of the American Statistical Association, 86.

[4] Eaton, M. L. (1986). A characterization of spherical distributions. Journal of Multivariate Analysis, 20.

[5] Ferré, L. and Yao, A. (2003). Functional sliced inverse regression analysis. Statistics, 37.

[6] Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86(414).

[7] Li, Y. and Hsing, T. (2007). Determination of the dimensionality in functional sliced inverse regression. Manuscript.
[8] Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis, 2nd Edition. Springer-Verlag, New York.

[9] Schoenberg, I. J. (1938). Metric spaces and completely monotone functions. Annals of Mathematics, 39(4).

[10] Vakhania, N. N., Tarieladze, V. I. and Chobanyan, S. A. (1987). Probability Distributions on Banach Spaces. D. Reidel, Dordrecht.
More informationIntroduction to Infinite Dimensional Stochastic Analysis
Introduction to Infinite Dimensional Stochastic Analysis By Zhi yuan Huang Department of Mathematics, Huazhong University of Science and Technology, Wuhan P. R. China and Jia an Yan Institute of Applied
More informationA note on the σ-algebra of cylinder sets and all that
A note on the σ-algebra of cylinder sets and all that José Luis Silva CCM, Univ. da Madeira, P-9000 Funchal Madeira BiBoS, Univ. of Bielefeld, Germany (luis@dragoeiro.uma.pt) September 1999 Abstract In
More informationElementary linear algebra
Chapter 1 Elementary linear algebra 1.1 Vector spaces Vector spaces owe their importance to the fact that so many models arising in the solutions of specific problems turn out to be vector spaces. The
More informationFUNCTIONAL DATA ANALYSIS. Contribution to the. International Handbook (Encyclopedia) of Statistical Sciences. July 28, Hans-Georg Müller 1
FUNCTIONAL DATA ANALYSIS Contribution to the International Handbook (Encyclopedia) of Statistical Sciences July 28, 2009 Hans-Georg Müller 1 Department of Statistics University of California, Davis One
More informationReal Analysis Notes. Thomas Goller
Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................
More informationDS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.
DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1
More informationGaussian Hilbert spaces
Gaussian Hilbert spaces Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto July 11, 015 1 Gaussian measures Let γ be a Borel probability measure on. For a, if γ = δ a then
More informationThe Multivariate Gaussian Distribution
The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance
More informationRecitation 1 (Sep. 15, 2017)
Lecture 1 8.321 Quantum Theory I, Fall 2017 1 Recitation 1 (Sep. 15, 2017) 1.1 Simultaneous Diagonalization In the last lecture, we discussed the situations in which two operators can be simultaneously
More informationConcentration Ellipsoids
Concentration Ellipsoids ECE275A Lecture Supplement Fall 2008 Kenneth Kreutz Delgado Electrical and Computer Engineering Jacobs School of Engineering University of California, San Diego VERSION LSECE275CE
More informationCourse Description - Master in of Mathematics Comprehensive exam& Thesis Tracks
Course Description - Master in of Mathematics Comprehensive exam& Thesis Tracks 1309701 Theory of ordinary differential equations Review of ODEs, existence and uniqueness of solutions for ODEs, existence
More informationCollocation based high dimensional model representation for stochastic partial differential equations
Collocation based high dimensional model representation for stochastic partial differential equations S Adhikari 1 1 Swansea University, UK ECCM 2010: IV European Conference on Computational Mechanics,
More informationFunctional Analysis Review
Outline 9.520: Statistical Learning Theory and Applications February 8, 2010 Outline 1 2 3 4 Vector Space Outline A vector space is a set V with binary operations +: V V V and : R V V such that for all
More informationHierarchical Modeling for Univariate Spatial Data
Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This
More informationMultivariate Distributions
Copyright Cosma Rohilla Shalizi; do not distribute without permission updates at http://www.stat.cmu.edu/~cshalizi/adafaepov/ Appendix E Multivariate Distributions E.1 Review of Definitions Let s review
More informationYour first day at work MATH 806 (Fall 2015)
Your first day at work MATH 806 (Fall 2015) 1. Let X be a set (with no particular algebraic structure). A function d : X X R is called a metric on X (and then X is called a metric space) when d satisfies
More informationLinear Algebra Review
January 29, 2013 Table of contents Metrics Metric Given a space X, then d : X X R + 0 and z in X if: d(x, y) = 0 is equivalent to x = y d(x, y) = d(y, x) d(x, y) d(x, z) + d(z, y) is a metric is for all
More informationMath 307 Learning Goals
Math 307 Learning Goals May 14, 2018 Chapter 1 Linear Equations 1.1 Solving Linear Equations Write a system of linear equations using matrix notation. Use Gaussian elimination to bring a system of linear
More informationUNIQUENESS OF POSITIVE SOLUTION TO SOME COUPLED COOPERATIVE VARIATIONAL ELLIPTIC SYSTEMS
TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9947(XX)0000-0 UNIQUENESS OF POSITIVE SOLUTION TO SOME COUPLED COOPERATIVE VARIATIONAL ELLIPTIC SYSTEMS YULIAN
More informationSum-of-Squares Method, Tensor Decomposition, Dictionary Learning
Sum-of-Squares Method, Tensor Decomposition, Dictionary Learning David Steurer Cornell Approximation Algorithms and Hardness, Banff, August 2014 for many problems (e.g., all UG-hard ones): better guarantees
More information(Multivariate) Gaussian (Normal) Probability Densities
(Multivariate) Gaussian (Normal) Probability Densities Carl Edward Rasmussen, José Miguel Hernández-Lobato & Richard Turner April 20th, 2018 Rasmussen, Hernàndez-Lobato & Turner Gaussian Densities April
More information(v, w) = arccos( < v, w >
MA322 Sathaye Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: For all v
More informationAn Introduction to Multivariate Statistical Analysis
An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents
More informationMultivariate Time Series: VAR(p) Processes and Models
Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with
More informationGAUSSIAN PROCESSES; KOLMOGOROV-CHENTSOV THEOREM
GAUSSIAN PROCESSES; KOLMOGOROV-CHENTSOV THEOREM STEVEN P. LALLEY 1. GAUSSIAN PROCESSES: DEFINITIONS AND EXAMPLES Definition 1.1. A standard (one-dimensional) Wiener process (also called Brownian motion)
More informationOperators with numerical range in a closed halfplane
Operators with numerical range in a closed halfplane Wai-Shun Cheung 1 Department of Mathematics, University of Hong Kong, Hong Kong, P. R. China. wshun@graduate.hku.hk Chi-Kwong Li 2 Department of Mathematics,
More informationI teach myself... Hilbert spaces
I teach myself... Hilbert spaces by F.J.Sayas, for MATH 806 November 4, 2015 This document will be growing with the semester. Every in red is for you to justify. Even if we start with the basic definition
More informationMath 307 Learning Goals. March 23, 2010
Math 307 Learning Goals March 23, 2010 Course Description The course presents core concepts of linear algebra by focusing on applications in Science and Engineering. Examples of applications from recent
More information18.S096 Problem Set 7 Fall 2013 Factor Models Due Date: 11/14/2013. [ ] variance: E[X] =, and Cov[X] = Σ = =
18.S096 Problem Set 7 Fall 2013 Factor Models Due Date: 11/14/2013 1. Consider a bivariate random variable: [ ] X X = 1 X 2 with mean and co [ ] variance: [ ] [ α1 Σ 1,1 Σ 1,2 σ 2 ρσ 1 σ E[X] =, and Cov[X]
More informationBivariate Splines for Spatial Functional Regression Models
Bivariate Splines for Spatial Functional Regression Models Serge Guillas Department of Statistical Science, University College London, London, WC1E 6BTS, UK. serge@stats.ucl.ac.uk Ming-Jun Lai Department
More informationWhitening and Coloring Transformations for Multivariate Gaussian Data. A Slecture for ECE 662 by Maliha Hossain
Whitening and Coloring Transformations for Multivariate Gaussian Data A Slecture for ECE 662 by Maliha Hossain Introduction This slecture discusses how to whiten data that is normally distributed. Data
More informationA Limit Theorem for the Squared Norm of Empirical Distribution Functions
University of Wisconsin Milwaukee UWM Digital Commons Theses and Dissertations May 2014 A Limit Theorem for the Squared Norm of Empirical Distribution Functions Alexander Nerlich University of Wisconsin-Milwaukee
More informationFive Mini-Courses on Analysis
Christopher Heil Five Mini-Courses on Analysis Metrics, Norms, Inner Products, and Topology Lebesgue Measure and Integral Operator Theory and Functional Analysis Borel and Radon Measures Topological Vector
More informationVector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.
Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar
More informationFunctional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...
Functional Analysis Franck Sueur 2018-2019 Contents 1 Metric spaces 1 1.1 Definitions........................................ 1 1.2 Completeness...................................... 3 1.3 Compactness......................................
More informationLecture notes: Applied linear algebra Part 1. Version 2
Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and
More informationA characterization of elliptical distributions and some optimality properties of principal components for functional data
A characterization of elliptical distributions and some optimality properties of principal components for functional data Graciela Boente,a, Matías Salibián Barrera b, David E. Tyler c a Facultad de Ciencias
More informationStochastic Design Criteria in Linear Models
AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework
More informationSpectral Continuity Properties of Graph Laplacians
Spectral Continuity Properties of Graph Laplacians David Jekel May 24, 2017 Overview Spectral invariants of the graph Laplacian depend continuously on the graph. We consider triples (G, x, T ), where G
More informationCHAPTER VIII HILBERT SPACES
CHAPTER VIII HILBERT SPACES DEFINITION Let X and Y be two complex vector spaces. A map T : X Y is called a conjugate-linear transformation if it is a reallinear transformation from X into Y, and if T (λx)
More information. Find E(V ) and var(v ).
Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2
MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is
More informationTheorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1
Chapter 2 Probability measures 1. Existence Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension to the generated σ-field Proof of Theorem 2.1. Let F 0 be
More informationFall 2016 MATH*1160 Final Exam
Fall 2016 MATH*1160 Final Exam Last name: (PRINT) First name: Student #: Instructor: M. R. Garvie Dec 16, 2016 INSTRUCTIONS: 1. The exam is 2 hours long. Do NOT start until instructed. You may use blank
More informationVectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =
Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.
More informationMath 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination
Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column
More informationPrincipal Component Analysis -- PCA (also called Karhunen-Loeve transformation)
Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation) PCA transforms the original input space into a lower dimensional space, by constructing dimensions that are linear combinations
More informationSemidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 2
Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 2 Instructor: Farid Alizadeh Scribe: Xuan Li 9/17/2001 1 Overview We survey the basic notions of cones and cone-lp and give several
More information