Multivariate Statistics


1 Multivariate Statistics Chapter 2: Multivariate distributions and inference Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid Course 2016/2017 Master in Mathematical Engineering Pedro Galeano (Course 2016/2017) Multivariate Statistics - Chapter 3 Master in Mathematical Engineering 1 / 92

2 Chapter outline 1 Introduction. 2 Basic concepts. 3 Multivariate distributions. 4 Statistical inference. 5 Hypothesis testing.

3 Introduction Multivariate statistical analysis is concerned with analysing and understanding data in high dimensions. Therefore, we assume that we are given a set of n observations of a multivariate random variable x in $\mathbb{R}^p$. Thus, each observation has p dimensions and is an observed value of the multivariate random variable x, which is composed of p random variables: $x = (x_1, \ldots, x_p)'$, where each $x_j$, for $j = 1, \ldots, p$, is a univariate random variable. In this chapter we give an introduction to the basic probability tools useful in multivariate statistical analysis.

4 Introduction In particular, we present: the basic probability tools used to describe a multivariate random variable, including marginal and conditional distributions and the concept of independence; the mean vector, the covariance matrix and the correlation matrix of a multivariate random variable and their counterparts for marginal and conditional distributions; the basic techniques needed to derive the distribution of transformations with special emphasis on linear transformations; several multivariate distributions, including the multivariate Gaussian distribution, along with most of its companion distributions and other interesting alternatives; and statistical inference for multivariate samples, including parameter estimation and hypothesis testing.

5 Basic concepts We can say that we have the joint distribution of a multivariate random variable when the following are specified: 1. The sample space of the possible values, which, in general, is a subset of $\mathbb{R}^p$. 2. The probabilities of each possible result of the sample space. We say that a p-dimensional random variable is discrete when each of the p scalar variables that comprise it is discrete. Analogously, we say that the variable is continuous if all its components are continuous.

6 Basic concepts Let $x = (x_1, \ldots, x_p)'$ be a multivariate random variable. The cumulative distribution function (cdf) of x at a point $x^0 = (x_1^0, \ldots, x_p^0)'$ is denoted by $F_x(x^0)$ and is given by: $F_x(x^0) = \Pr(x \le x^0) = \Pr(x_1 \le x_1^0, \ldots, x_p \le x_p^0)$

7 Basic concepts For continuous multivariate random variables, a nonnegative probability density function (pdf) $f_x$ exists, such that: $F_x(x^0) = \int_{-\infty}^{x_1^0} \cdots \int_{-\infty}^{x_p^0} f_x(x_1, \ldots, x_p)\, dx_1 \cdots dx_p$. Note that: $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_x(x_1, \ldots, x_p)\, dx_1 \cdots dx_p = 1$. Note also that the cdf $F_x$ is differentiable with: $f_x(x) = \frac{\partial^p F_x(x)}{\partial x_1 \cdots \partial x_p}$

8 Basic concepts For discrete multivariate random variables, the values of the random variable are concentrated on a countable or finite set of points $\{c_j\}_{j \in J}$. The probability of events of the form $x \in D$, for a certain set $D \subseteq \{c_j\}_{j \in J}$, can be computed as: $\Pr(x \in D) = \sum_{j : c_j \in D} \Pr(x = c_j)$. For simplicity we will focus on continuous multivariate random variables.

9 Basic concepts The marginal density function of a subset of the elements of x, say $(x_{i_1}, \ldots, x_{i_j})'$, is obtained by integrating out the remaining components: $f_{x_{i_1}, \ldots, x_{i_j}}(x_{i_1}, \ldots, x_{i_j}) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_x(x_1, \ldots, x_p) \prod_{k \notin \{i_1, \ldots, i_j\}} dx_k$. In particular, the marginal density function of each $x_j$, for $j = 1, \ldots, p$, is given by: $f_{x_j}(x_j) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_x(x_1, \ldots, x_p) \prod_{k \ne j} dx_k$

10 Basic concepts Let $x = (x_1, \ldots, x_p)'$ and $y = (y_1, \ldots, y_q)'$ be two multivariate random variables with density functions $f_x$ and $f_y$, respectively, and joint density function $f_{x,y}$. Then, the conditional density function of y given x is given by: $f_{y|x}(y|x) = \frac{f_{x,y}(x, y)}{f_x(x)}$

11 Basic concepts From the previous definition, we can deduce that the pdf of (x, y) is given by: $f_{x,y}(x, y) = f_{y|x}(y|x)\, f_x(x) = f_{x|y}(x|y)\, f_y(y)$. As a consequence: $f_{y|x}(y|x) = \frac{f_{x|y}(x|y)\, f_y(y)}{f_x(x)} = \frac{f_{x|y}(x|y)\, f_y(y)}{\int f_{x,y}(x, y)\, dy} = \frac{f_{x|y}(x|y)\, f_y(y)}{\int f_{x|y}(x|y)\, f_y(y)\, dy}$. This is Bayes' Theorem, one of the most important results in Statistics, as it is the basis of Bayesian inference.
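As a quick numerical sanity check, Bayes' Theorem can be verified on a small discrete example (the joint pmf below is made up; in the discrete case the integral in the denominator becomes a sum):

```python
import numpy as np

# Hypothetical joint pmf of (x, y) on a 2x3 grid; rows index x, columns index y.
joint = np.array([[0.10, 0.20, 0.05],
                  [0.15, 0.30, 0.20]])

f_x = joint.sum(axis=1)             # marginal of x
f_y = joint.sum(axis=0)             # marginal of y
f_y_given_x = joint / f_x[:, None]  # f_{y|x} = f_{x,y} / f_x
f_x_given_y = joint / f_y[None, :]  # f_{x|y} = f_{x,y} / f_y

# Bayes' Theorem: f_{y|x} = f_{x|y} f_y / sum_y f_{x|y} f_y
numer = f_x_given_y * f_y[None, :]
bayes = numer / numer.sum(axis=1, keepdims=True)

assert np.allclose(bayes, f_y_given_x)
```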

12 Basic concepts The multivariate random variables x and y are independent if, and only if: $f_{x,y}(x, y) = f_x(x)\, f_y(y)$. Therefore, if x and y are independent, then: $f_{y|x}(y|x) = f_y(y)$ and $f_{x|y}(x|y) = f_x(x)$. Independence can be interpreted as follows: knowing $y = y^0$ does not change the probability assessments on x, and conversely. In general, the p univariate random variables $x_1, \ldots, x_p$ are independent if, and only if: $f_{x_1, \ldots, x_p}(x_1, \ldots, x_p) = f_{x_1}(x_1) \cdots f_{x_p}(x_p)$

13 Basic concepts It is important to note that different multivariate pdf's may have the same marginal pdf's. For instance, it is easy to see that the bivariate pdf's given by: $f_{x_1,x_2}(x_1, x_2) = 1$, for $0 < x_1, x_2 < 1$, and $f_{x_1,x_2}(x_1, x_2) = 1 + (2x_1 - 1)(2x_2 - 1)$, for $0 < x_1, x_2 < 1$, both have the marginal pdf's given by: $f_{x_1}(x_1) = 1$, for $0 < x_1 < 1$, and $f_{x_2}(x_2) = 1$, for $0 < x_2 < 1$, respectively.

14 Basic concepts An elegant concept for connecting marginals with joint cdf's is given by copulae. For simplicity of presentation we concentrate on the p = 2 dimensional case. A 2-dimensional copula is a function $C : [0, 1]^2 \to [0, 1]$ with the following properties: 1. For every $u \in [0, 1]$: $C(0, u) = C(u, 0) = 0$. 2. For every $u \in [0, 1]$: $C(1, u) = C(u, 1) = u$. 3. For every $(u_1, u_2), (v_1, v_2) \in [0, 1] \times [0, 1]$ with $u_1 \le v_1$ and $u_2 \le v_2$: $C(v_1, v_2) - C(v_1, u_2) - C(u_1, v_2) + C(u_1, u_2) \ge 0$

15 Basic concepts The usefulness of a copula function C is explained by Sklar's Theorem. Sklar's Theorem: Let $F_{x_1,x_2}$ be a bivariate cdf with marginal cdf's $F_{x_1}$ and $F_{x_2}$. Then, a copula $C_{x_1,x_2}$ exists with: $F_{x_1,x_2}(x_1, x_2) = C_{x_1,x_2}(F_{x_1}(x_1), F_{x_2}(x_2))$ for every $(x_1, x_2) \in \mathbb{R}^2$. If $F_{x_1}$ and $F_{x_2}$ are continuous, then $C_{x_1,x_2}$ is unique. On the other hand, if $C_{x_1,x_2}$ is a copula and $F_{x_1}$ and $F_{x_2}$ are cdf's, then the function $F_{x_1,x_2}$ defined above is a bivariate cdf with marginals $F_{x_1}$ and $F_{x_2}$. Therefore, a copula function links a multivariate distribution to its one-dimensional marginals.

16 Basic concepts Theorem: Let $x_1$ and $x_2$ be random variables with cdf's $F_{x_1}$ and $F_{x_2}$, and bivariate cdf $F_{x_1,x_2}$. Then, $x_1$ and $x_2$ are independent if and only if: $C_{x_1,x_2}(F_{x_1}, F_{x_2}) = F_{x_1} F_{x_2}$. The previous copula function is called the independence copula. Other copula functions will be given in this chapter.
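The three defining properties of a 2-dimensional copula, together with the independence copula $C(u_1, u_2) = u_1 u_2$ of the theorem above, can be checked numerically; this is a minimal sketch with made-up evaluation points:

```python
import numpy as np

# The independence copula C(u1, u2) = u1 * u2.
def C(u1, u2):
    return u1 * u2

u = np.linspace(0.0, 1.0, 11)

# Property 1: grounded at zero; Property 2: uniform margins.
assert np.allclose(C(0.0, u), 0.0) and np.allclose(C(u, 0.0), 0.0)
assert np.allclose(C(1.0, u), u) and np.allclose(C(u, 1.0), u)

# Property 3 (2-increasing): the C-volume of the rectangle [u1,v1] x [u2,v2]
# is nonnegative; for the independence copula it equals (v1-u1)*(v2-u2).
u1, v1, u2, v2 = 0.2, 0.6, 0.3, 0.9
volume = C(v1, v2) - C(v1, u2) - C(u1, v2) + C(u1, u2)
assert volume >= 0.0
```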

17 Basic concepts Let $x = (x_1, \ldots, x_p)'$ be a multivariate random variable. The expectation or mean vector of x is the vector $\mu_x$ whose components are the expectations or means of the components of the random variable, i.e.: $\mu_x = E[x] = (E[x_1], \ldots, E[x_p])'$ where $E[x_j] = \int x_j f_{x_j}(x_j)\, dx_j$ and $f_{x_j}(x_j)$ is the marginal density function of $x_j$.

18 Basic concepts The covariance matrix of the multivariate random variable x with mean vector $\mu_x$ is a symmetric and positive semidefinite matrix given by: $\Sigma_x = E[(x - \mu_x)(x - \mu_x)']$. The diagonal elements of $\Sigma_x$ are the variances of the components, given by: $\sigma^2_{x,j} = \int (x_j - \mu_{x,j})^2 f_{x_j}(x_j)\, dx_j$, for $j = 1, \ldots, p$. The elements outside the main diagonal are the covariances between pairs of variables: $\sigma_{x,jk} = \int\int (x_j - \mu_{x,j})(x_k - \mu_{x,k}) f_{x_j,x_k}(x_j, x_k)\, dx_j\, dx_k$, for $j, k = 1, \ldots, p$.

19 Basic concepts The correlation matrix of the multivariate random variable x with covariance matrix $\Sigma_x$ is given by: $\varrho_x = \Delta_x^{-1/2} \Sigma_x \Delta_x^{-1/2}$ where $\Delta_x$ is a diagonal matrix with the variances of the components of x. The elements outside the main diagonal are the correlations between pairs of variables, given by: $\rho_{x,jk} = \frac{\sigma_{x,jk}}{\sigma_{x,j}\, \sigma_{x,k}}$
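As a sketch of the relation $\varrho_x = \Delta_x^{-1/2} \Sigma_x \Delta_x^{-1/2}$, the following computes a correlation matrix from a made-up covariance matrix:

```python
import numpy as np

# Made-up 3x3 covariance matrix (symmetric, positive definite).
Sigma = np.array([[4.0,  2.0,  0.0],
                  [2.0,  9.0, -3.0],
                  [0.0, -3.0,  4.0]])

# Delta^{-1/2}: diagonal matrix with 1 / standard deviations.
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))
rho = D_inv_sqrt @ Sigma @ D_inv_sqrt

assert np.allclose(np.diag(rho), 1.0)                 # unit diagonal
assert np.isclose(rho[0, 1], 2.0 / (2.0 * 3.0))       # sigma_12 / (sigma_1 sigma_2)
```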

20 Basic concepts Let $x = (x_1, \ldots, x_p)'$ be a multivariate random variable and let $(x_{i_1}, \ldots, x_{i_j})'$ be a subset of the elements of x. Then, the mean vector and the covariance and correlation matrices of $(x_{i_1}, \ldots, x_{i_j})'$ are obtained by extracting the corresponding elements of the mean vector and the covariance and correlation matrices of x.

21 Basic concepts Let $x = (x_1, \ldots, x_p)'$ and $y = (y_1, \ldots, y_q)'$ be two random variables with density functions $f_x$ and $f_y$, respectively, and let $f_{y|x}$ be the conditional density function of y given x. The conditional expectation of y given x is given by: $E_{y|x}[y|x] = \int y\, f_{y|x}(y|x)\, dy$ which depends on x. An important property of $E_{y|x}[y|x]$ is that $E_y[y] = E_x[E_{y|x}[y|x]]$. Then, to compute $E_y[y]$, we can first compute $E_{y|x}[y|x]$ and then take the expectation with respect to the distribution of x.

22 Basic concepts Similarly, the conditional covariance and correlation matrices are the covariance and correlation matrices of the multivariate random variable y|x. In particular, the conditional covariance matrix contains the conditional variances, $\mathrm{Var}_{y_j|x}[y_j|x]$, and the conditional covariances, $\mathrm{Cov}_{y_j,y_k|x}[y_j, y_k|x]$. An important property of $\mathrm{Var}_{y_j|x}[y_j|x]$ is that: $\mathrm{Var}_{y_j}[y_j] = E_x[\mathrm{Var}_{y_j|x}[y_j|x]] + \mathrm{Var}_x[E_{y_j|x}[y_j|x]]$. This is usually called the law of total variance.

23 Basic concepts Let $x = (x_1, \ldots, x_p)'$ and $y = (y_1, \ldots, y_q)'$ be two multivariate random variables with mean vectors $\mu_x$ and $\mu_y$ and covariance matrices $\Sigma_x$ and $\Sigma_y$, respectively. The covariance matrix between x and y is a $p \times q$ matrix given by: $\mathrm{Cov}[x, y] = E[(x - \mu_x)(y - \mu_y)']$. Similarly, the correlation matrix between x and y is a $p \times q$ matrix given by: $\mathrm{Cor}[x, y] = \Delta_x^{-1/2}\, \mathrm{Cov}[x, y]\, \Delta_y^{-1/2}$ where $\Delta_x$ and $\Delta_y$ are diagonal matrices whose elements are the diagonal elements of $\Sigma_x$ and $\Sigma_y$, respectively.

24 Basic concepts Let $x = (x_1, \ldots, x_p)'$ be a multivariate variable with pdf $f_x$ and let $y = (y_1, \ldots, y_p)'$ be a new variable given by: $y = g(x)$ where g is a function with differentiable inverse given by: $x = g^{-1}(y) = h(y)$. Therefore, the pdf of y is given by: $f_y(y) = f_x(x) \left|\det\left(\frac{\partial x}{\partial y}\right)\right| = f_x(h(y)) \left|\det\left(\frac{\partial h(y)}{\partial y}\right)\right|$ where $\frac{\partial x}{\partial y}$ is the Jacobian of the transformation, $\det(\cdot)$ stands for determinant and $|\cdot|$ denotes the absolute value function.

25 Basic concepts Consider the particular case of a linear transformation, $y = Ax + b$, where A is a non-singular $p \times p$ matrix and b is a $p \times 1$ vector. Then, we have that $x = A^{-1}(y - b)$ while $\frac{\partial x}{\partial y} = A^{-1}$. Therefore: $f_y(y) = f_x\left(A^{-1}(y - b)\right) \left|\det\left(A^{-1}\right)\right|$

26 Basic concepts The previous case only considers transformations from a p-dimensional random variable to another p-dimensional random variable. The case of transformations from a p-dimensional random variable to a q-dimensional random variable, with $p \ne q$, is more difficult to handle. Therefore, we focus on the mean vector and the covariance matrix of the transformed random variable. Let $x = (x_1, \ldots, x_p)'$ be a multivariate random variable and let $y = (y_1, \ldots, y_q)'$ be such that: $y = Ax + b$ where A is a $q \times p$ matrix and b is a $q \times 1$ column vector. Then, letting $\mu_x$ and $\mu_y$ be the mean vectors and $\Sigma_x$ and $\Sigma_y$ be the covariance matrices of x and y, respectively, we have: $\mu_y = A\mu_x + b$, $\Sigma_y = A\Sigma_x A'$
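The moment formulas $\mu_y = A\mu_x + b$ and $\Sigma_y = A\Sigma_x A'$ can be illustrated by Monte Carlo; all parameters below are made up, and a Gaussian sampler is used only for convenience (the formulas hold for any distribution with these moments):

```python
import numpy as np

rng = np.random.default_rng(1)
mu_x = np.array([1.0, -2.0, 0.5])
Sigma_x = np.array([[2.0, 0.5, 0.0],
                    [0.5, 1.0, 0.3],
                    [0.0, 0.3, 1.5]])
A = np.array([[1.0,  0.0, 2.0],     # maps p = 3 down to q = 2
              [0.0, -1.0, 1.0]])
b = np.array([0.0, 3.0])

mu_y = A @ mu_x + b                 # theoretical mean of y
Sigma_y = A @ Sigma_x @ A.T         # theoretical covariance of y

x = rng.multivariate_normal(mu_x, Sigma_x, size=200_000)
y = x @ A.T + b                     # apply y = Ax + b row by row

assert np.allclose(y.mean(axis=0), mu_y, atol=0.05)
assert np.allclose(np.cov(y, rowvar=False), Sigma_y, atol=0.1)
```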

27 Multivariate distributions The multivariate Gaussian distribution is a generalization to two or more dimensions of the univariate Gaussian (or Normal) distribution. The latter is often characterized by its resemblance to the shape of a bell, which is why it is popularly referred to as the bell curve. The Gaussian distribution is used extensively in both theoretical and applied statistics research. Although it is well known that real data rarely obey the dictates of the Gaussian distribution, it does provide us with a useful approximation to reality. The pdf of a univariate Gaussian random variable with mean $\mu_x = E(x)$ and variance $\sigma_x^2 = \mathrm{Var}(x)$ is: $f_x(x) = (2\pi\sigma_x^2)^{-1/2} \exp\left(-\frac{(x - \mu_x)^2}{2\sigma_x^2}\right)$, $-\infty < x < \infty$, and we denote it as $x \sim N(\mu_x, \sigma_x^2)$.

28 Multivariate distributions [Figure: PDF of N(0,1) in blue, N(1,1) in green and N(0,2) in orange]

29 Multivariate distributions Generalizing the univariate Gaussian distribution, the pdf of a multivariate Gaussian random variable $x = (x_1, \ldots, x_p)'$ with mean vector $\mu_x = E(x)$ and covariance matrix $\Sigma_x = \mathrm{Cov}(x)$ is given by: $f_x(x) = (2\pi)^{-p/2} |\Sigma_x|^{-1/2} \exp\left(-\frac{1}{2}(x - \mu_x)' \Sigma_x^{-1} (x - \mu_x)\right)$ where $-\infty < x_j < \infty$, for $j = 1, \ldots, p$. We denote it as $x \sim N_p(\mu_x, \Sigma_x)$. The next slides show some examples of pdfs of bivariate Gaussian distributions.
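A minimal check of the density formula against SciPy's implementation, at a single made-up evaluation point:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Made-up parameters and evaluation point for a bivariate Gaussian.
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
x0 = np.array([0.5, 0.5])

p = len(mu)
diff = x0 - mu
quad = diff @ np.linalg.inv(Sigma) @ diff   # (x - mu)' Sigma^{-1} (x - mu)
pdf_formula = (2 * np.pi) ** (-p / 2) * np.linalg.det(Sigma) ** (-0.5) * np.exp(-0.5 * quad)

assert np.isclose(pdf_formula, multivariate_normal(mu, Sigma).pdf(x0))
```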

30 Multivariate distributions [Figure: PDF of the multivariate standard Gaussian]

31 Multivariate distributions [Figure: PDF of a bivariate Gaussian with correlation]

32 Multivariate distributions [Figure: PDF of a bivariate Gaussian with correlation]

33 Multivariate distributions How is the $N_p(\mu_x, \Sigma_x)$ distribution related to the $N_p(0_p, I_p)$ distribution (the standard multivariate Gaussian distribution)? Through a linear transformation, as follows. Let $x \sim N_p(\mu_x, \Sigma_x)$ and $y = \Sigma_x^{-1/2}(x - \mu_x)$. Then, $y \sim N_p(0_p, I_p)$. How can we create $N_p(\mu_x, \Sigma_x)$ variables on the basis of $N_p(0_p, I_p)$ variables? We use the inverse linear transformation: $x = \Sigma_x^{1/2} y + \mu_x$. Additionally, it is of interest to know the distribution of a Gaussian variable after it has been linearly transformed. Let $x \sim N_p(\mu_x, \Sigma_x)$, A a $q \times p$ matrix and b a $q \times 1$ column vector. Then, $y = Ax + b$ has a $N_q(A\mu_x + b, A\Sigma_x A')$ distribution. Therefore, y also has a Gaussian distribution.
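The construction $x = \Sigma_x^{1/2} y + \mu_x$ translates into a sampler as sketched below; the symmetric square root is computed by eigendecomposition (a Cholesky factor would serve equally well), and $\mu_x$, $\Sigma_x$ are made up:

```python
import numpy as np

rng = np.random.default_rng(42)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Symmetric square root Sigma^{1/2} via the spectral decomposition.
vals, vecs = np.linalg.eigh(Sigma)
Sigma_half = vecs @ np.diag(np.sqrt(vals)) @ vecs.T
assert np.allclose(Sigma_half @ Sigma_half, Sigma)

# Standard Gaussian draws y ~ N_p(0, I), then x = Sigma^{1/2} y + mu.
y = rng.standard_normal((100_000, 2))
x = y @ Sigma_half + mu       # Sigma_half is symmetric, so no transpose needed

assert np.allclose(x.mean(axis=0), mu, atol=0.05)
assert np.allclose(np.cov(x, rowvar=False), Sigma, atol=0.05)
```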

34 Multivariate distributions The level curves or contours are the curves obtained by cutting the probability density function by parallel hyperplanes. In other words, the level curves are sets of points with the same density value. In the multivariate Gaussian case, their equation is given by: $(x - \mu_x)' \Sigma_x^{-1} (x - \mu_x) = c$ where c is a constant. Therefore, the level curves of multivariate Gaussian distributions are ellipsoids. The next two slides show the level curves for the Gaussian distributions considered in the previous plots, with and without a sample of 100 points generated from these distributions.

35 Multivariate distributions [Figure: level curves for Gaussians with correlation 0, correlation .9, and a third correlation]

36 Multivariate distributions [Figure: level curves for Gaussians with correlation 0, correlation .9, and a third correlation, with 100 sampled points]

37 Multivariate distributions The level curves of the multivariate Gaussian distribution give us a notion of distance between points. Note that all the points on a level curve have the same density and form an ellipsoid. Therefore, it is reasonable to assume that all the points on a level curve are at the same distance from the center of the distribution. The implied distance is the Mahalanobis distance between x and $\mu_x$, given by: $D_M(x, \mu_x)^2 = (x - \mu_x)' \Sigma_x^{-1} (x - \mu_x)$. If $x \sim N_p(\mu_x, \Sigma_x)$, the squared Mahalanobis distance has a $\chi^2_p$ distribution, i.e., $D_M(x, \mu_x)^2 \sim \chi^2_p$. The Mahalanobis distance plays an important role in many problems such as outlier detection, classification, clustering and so on.
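A sketch of outlier screening with the Mahalanobis distance: under the Gaussian model, squared distances follow a $\chi^2_p$ law, so about 1% of observations should exceed its 0.99 quantile. All parameters below are made up:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
mu = np.array([0.0, 0.0, 0.0])
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 2.0, 0.5],
                  [0.0, 0.5, 1.0]])
x = rng.multivariate_normal(mu, Sigma, size=50_000)

# Squared Mahalanobis distances D_M(x_i, mu)^2 for all rows at once.
Sigma_inv = np.linalg.inv(Sigma)
diff = x - mu
d2 = np.einsum('ij,jk,ik->i', diff, Sigma_inv, diff)

# Under Gaussianity, d2 ~ chi^2_3: roughly 1% exceed the 0.99 quantile.
cutoff = chi2.ppf(0.99, df=3)
frac_flagged = (d2 > cutoff).mean()
assert abs(frac_flagged - 0.01) < 0.005
```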

38 Multivariate distributions [Figure: random sample and its Mahalanobis distances]

39 Multivariate distributions It is useful to know more about the multivariate Gaussian distribution, since it is often a good approximation in many situations. It is often of interest to partition x into sub-variables. Therefore, we partition x, its mean vector $\mu_x$ and its covariance matrix $\Sigma_x$ as: $x = (x_{(1)}', x_{(2)}')'$, $\mu_x = (\mu_{x(1)}', \mu_{x(2)}')'$ and $\Sigma_x = \begin{pmatrix} \Sigma_{x(11)} & \Sigma_{x(12)} \\ \Sigma_{x(21)} & \Sigma_{x(22)} \end{pmatrix}$ where $x_{(1)}$ and $x_{(2)}$ have dimensions q and p − q, respectively. Then, $x_{(1)} \sim N_q(\mu_{x(1)}, \Sigma_{x(11)})$, $x_{(2)} \sim N_{p-q}(\mu_{x(2)}, \Sigma_{x(22)})$ and $\mathrm{Cov}(x_{(1)}, x_{(2)}) = \Sigma_{x(12)}$. Moreover, $x_{(1)}$ and $x_{(2)}$ are independent if and only if $\Sigma_{x(12)} = 0_{(q, p-q)}$, where $0_{(q, p-q)}$ is a $q \times (p - q)$ matrix of zeros.

40 Multivariate distributions If $\Sigma_{x(22)} > 0$, then the conditional distribution of $x_{(1)}$ given $x_{(2)}$ is Gaussian with mean: $\mu_{x(1)} + \Sigma_{x(12)} \Sigma_{x(22)}^{-1} \left(x_{(2)} - \mu_{x(2)}\right)$ and covariance matrix: $\Sigma_{x(11)} - \Sigma_{x(12)} \Sigma_{x(22)}^{-1} \Sigma_{x(21)}$. If $x_{(1)}$ and $x_{(2)}$ are independent and distributed as $N_q(\mu_{x(1)}, \Sigma_{x(11)})$ and $N_{p-q}(\mu_{x(2)}, \Sigma_{x(22)})$, respectively, then $x = (x_{(1)}', x_{(2)}')'$ has the multivariate Gaussian distribution: $N_p\left(\begin{pmatrix} \mu_{x(1)} \\ \mu_{x(2)} \end{pmatrix}, \begin{pmatrix} \Sigma_{x(11)} & 0_{(q, p-q)} \\ 0_{(p-q, q)} & \Sigma_{x(22)} \end{pmatrix}\right)$
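For q = 1 and p = 2 the conditional-mean and conditional-covariance formulas reduce to scalars, which makes them easy to check by hand; the numbers below are made up:

```python
import numpy as np

# Made-up bivariate Gaussian, partitioned into x(1) = x_1 and x(2) = x_2.
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
S11, S12 = Sigma[0, 0], Sigma[0, 1]
S21, S22 = Sigma[1, 0], Sigma[1, 1]

x2_obs = 3.0   # hypothetical observed value of x_2

# mu1 + S12 S22^{-1} (x2 - mu2)  and  S11 - S12 S22^{-1} S21
cond_mean = mu[0] + S12 / S22 * (x2_obs - mu[1])
cond_var = S11 - S12 / S22 * S21

assert np.isclose(cond_mean, 1.8)    # 1 + 0.8 * (3 - 2)
assert np.isclose(cond_var, 1.36)    # 2 - 0.8 * 0.8
```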

41 Multivariate distributions The multivariate Gaussian distribution belongs to the large family of elliptical distributions, which has recently gained a lot of attention in financial mathematics. The simplest case of elliptical distributions is the subclass of spherical distributions. We say that a vector variable $x = (x_1, \ldots, x_p)'$ follows a spherical distribution if its density function depends on the variable only through $x'x$. Therefore, the level curves of the distribution are spheres centered at the origin and the distribution is invariant under rotations. In other words, if we define $y = Cx$, where C is an orthogonal matrix, the density of the variable y is the same as that of x. This is only one of the possible ways to define spherical distributions. We can see spherical distributions as an extension of the standard multivariate Gaussian distribution $N_p(0_p, I_p)$.

42 Multivariate distributions The variable $x = (x_1, \ldots, x_p)'$ follows an elliptical distribution if its density function depends on x only through $(x - m)' V^{-1} (x - m)$, where m is a $p \times 1$ column vector and V is a $p \times p$ matrix (not necessarily the mean and the covariance matrix of x). Elliptical distributions have level curves that are ellipsoids centered at m. The multivariate Gaussian distribution is the best known elliptical distribution. Indeed, elliptical distributions can be seen as an extension of the $N_p(\mu_x, \Sigma_x)$.

43 Multivariate distributions Let $y \sim N_p(0_p, \Sigma)$ and $u \sim \chi^2_\nu$ be independent. The multivariate random variable: $x = \mu + \sqrt{\nu / u}\; y$ has a multivariate Student's t distribution with parameters $\mu$, $\Sigma$ and $\nu$. For $\nu > 2$, the mean of the distribution is $\mu$ and the covariance matrix is $\frac{\nu}{\nu - 2}\Sigma$. The parameter $\nu$ is called the degrees of freedom parameter. The density function of a multivariate Student's t distribution is given by: $f_x(x) = \frac{\Gamma\left(\frac{\nu + p}{2}\right)}{(\pi\nu)^{p/2}\, \Gamma\left(\frac{\nu}{2}\right)}\, |\Sigma|^{-1/2} \left(1 + \frac{(x - \mu)' \Sigma^{-1} (x - \mu)}{\nu}\right)^{-\frac{\nu + p}{2}}$. The multivariate Student's t distribution belongs to the class of elliptical distributions. In particular, if $\mu = 0_p$ and $\Sigma = I_p$, this distribution belongs to the class of spherical distributions.
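The stochastic construction above translates directly into a sampler; this sketch (with made-up $\mu$, $\Sigma$ and $\nu$) checks the covariance $\frac{\nu}{\nu - 2}\Sigma$ by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(3)
nu = 5.0
mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.0]])

# x = mu + sqrt(nu / u) * y, with y ~ N_p(0, Sigma) and u ~ chi^2_nu independent.
n = 400_000
y = rng.multivariate_normal([0.0, 0.0], Sigma, size=n)
u = rng.chisquare(nu, size=n)
x = mu + np.sqrt(nu / u)[:, None] * y

# For nu > 2 the mean is mu and the covariance is nu / (nu - 2) * Sigma.
assert np.allclose(x.mean(axis=0), mu, atol=0.02)
assert np.allclose(np.cov(x, rowvar=False), nu / (nu - 2.0) * Sigma, atol=0.05)
```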

44 Multivariate distributions [Figure: PDF of a bivariate Student's t distribution with 5 degrees of freedom]

45 Multivariate distributions Elliptical distributions share many properties with Gaussian distributions: marginal and conditional distributions are also elliptical, and the conditional means are a linear function of the determining variables. Nevertheless, the Gaussian distribution is the only one in the family to have the property whereby if the covariance matrix is diagonal, all the component variables are independent.

46 Multivariate distributions A distribution is called heavy-tailed if it has higher probability density in its tail area than a Gaussian distribution with the same mean vector and covariance matrix. The multivariate Student's t distribution is an example of a heavy-tailed distribution. Other examples of heavy-tailed distributions include the multivariate generalized hyperbolic distribution, the multivariate Laplace distribution and multivariate mixtures of distributions. In particular, we briefly review multivariate mixtures of distributions.

47 Multivariate distributions Mixture modelling concerns modelling a statistical distribution by a mixture (or weighted sum) of different distributions. For many choices of component density functions, the mixture model can approximate any continuous density to arbitrary accuracy, provided that the number of component density functions is sufficiently large and the parameters of the model are chosen correctly. The density function of a multivariate random variable $x = (x_1, \ldots, x_p)'$ that follows a mixture distribution is given by: $f_x(x) = \sum_{g=1}^G \pi_g f_{x,g}(x)$ where: $\pi_1, \ldots, \pi_G$ are weights such that $\sum_{g=1}^G \pi_g = 1$; and $f_{x,1}(x), \ldots, f_{x,G}(x)$ are multivariate pdf's.

48 Multivariate distributions Note that mixture distributions have an interesting interpretation in terms of heterogeneous populations. Assume a population on which we have defined the multivariate random variable x and that can be subdivided into G more homogeneous groups. Then, $\pi_1, \ldots, \pi_G$ can be seen as the proportions of elements in the groups $1, \ldots, G$, while $f_{x,1}(x), \ldots, f_{x,G}(x)$ are the multivariate pdf's associated with each subpopulation.
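A two-component bivariate Gaussian mixture (all parameters made up) can be sketched as follows; the grid sum checks that the mixture density integrates to one:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Made-up mixture: f(x) = pi_1 f_1(x) + pi_2 f_2(x), with G = 2 components.
weights = [0.3, 0.7]
comps = [
    multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]]),
    multivariate_normal(mean=[3.0, 3.0], cov=[[1.0, 0.5], [0.5, 2.0]]),
]

def mixture_pdf(x):
    return sum(w * c.pdf(x) for w, c in zip(weights, comps))

# The mixture density should integrate to 1; Riemann sum on a coarse grid.
grid = np.linspace(-8.0, 12.0, 401)
X, Y = np.meshgrid(grid, grid)
pts = np.column_stack([X.ravel(), Y.ravel()])
total = mixture_pdf(pts).sum() * (grid[1] - grid[0]) ** 2
assert abs(total - 1.0) < 1e-3
```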

49 Multivariate distributions [Figure: PDF of a mixture distribution]

50 Multivariate distributions [Figure: level curves for a mixture of Gaussian distributions]

51 Multivariate distributions [Figure: PDF of a mixture distribution]

52 Multivariate distributions [Figure: level curves for a mixture of Gaussian distributions]

53 Multivariate distributions One main problem in multivariate analysis is how to model the dependence of the components of a multivariate random variable. We have seen several multivariate distributions that model this dependence. However, these models, except perhaps mixtures, are not flexible enough to model multivariate dependence. As seen before, copulae represent an elegant concept for connecting marginals with joint cumulative distribution functions. Copulas are functions that join or couple multivariate distribution functions to their 1-dimensional marginal distribution functions.

54 Multivariate distributions Let $x = (x_1, \ldots, x_p)'$ be a multivariate random variable and let $F_{x_j}$, for $j = 1, \ldots, p$, be the marginal distribution functions of the components of x. Using copulae, the marginal distribution functions can be modelled separately from their dependence structure and then coupled together to form the multivariate distribution $F_x$. The formal definition of a p-dimensional copula function is more complex than in the 2-dimensional case. However, the intuition is the same as in the 2-dimensional case, so we do not provide its formal definition here.

55 Multivariate distributions Sklar's Theorem in p dimensions: Let $F_x$ be a p-dimensional distribution function with marginal distribution functions $F_{x_1}, \ldots, F_{x_p}$. Then, a p-dimensional copula $C_x$ exists such that for all $(x_1, \ldots, x_p) \in \mathbb{R}^p$: $F_x(x_1, \ldots, x_p) = C_x\left(F_{x_1}(x_1), \ldots, F_{x_p}(x_p)\right)$. Moreover, if $F_{x_1}, \ldots, F_{x_p}$ are continuous then $C_x$ is unique. Conversely, if $C_x$ is a copula and $F_{x_1}, \ldots, F_{x_p}$ are distribution functions, then $F_x$ defined above is a p-dimensional distribution function with marginals $F_{x_1}, \ldots, F_{x_p}$.

56 Multivariate distributions Let $F_z$ denote the univariate standard Gaussian distribution function and $F_x$ the p-dimensional Gaussian distribution function with mean vector $0_p$ and covariance (as well as correlation) matrix $\Sigma_x$. Then, the function: $C^{\mathrm{Gauss}}_{x,\Sigma_x}(u) = F_x\left(F_z^{-1}(u_1), \ldots, F_z^{-1}(u_p)\right)$ is the p-dimensional Gaussian copula with correlation matrix $\Sigma_x$, where $u = (u_1, \ldots, u_p)' \in [0, 1]^p$. If $\Sigma_x \ne I_p$, then the corresponding Gaussian copula allows one to generate joint symmetric dependence. However, it is not possible to model tail dependence, i.e., joint extreme events have zero probability.

57 Multivariate distributions The function: $C^{GH}_{x,\theta}(u) = \exp\left(-\left(\sum_{j=1}^p (-\log u_j)^\theta\right)^{1/\theta}\right)$ is the p-dimensional Gumbel-Hougaard copula function, where $\theta \in [1, \infty)$. Unlike the Gaussian copula, $C^{GH}_{x,\theta}$ can generate an upper tail dependence.
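The Gumbel-Hougaard formula is easy to implement directly; for $\theta = 1$ it reduces to the independence copula $\prod_j u_j$, and as $\theta \to \infty$ it approaches the comonotonicity bound $\min_j u_j$ (the evaluation point below is made up):

```python
import numpy as np

# Gumbel-Hougaard copula C(u) = exp(-(sum_j (-log u_j)^theta)^(1/theta)).
def gumbel_copula(u, theta):
    u = np.asarray(u, dtype=float)
    return np.exp(-np.sum((-np.log(u)) ** theta) ** (1.0 / theta))

u = [0.4, 0.7, 0.9]

# theta = 1: independence copula, C(u) = u1 * u2 * u3.
assert np.isclose(gumbel_copula(u, 1.0), np.prod(u))

# Very large theta: approaches min(u_j), the strongest positive dependence.
assert np.isclose(gumbel_copula(u, 200.0), min(u), atol=1e-3)
```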

58 Multivariate distributions [Figure: PDF of a copula distribution]

59 Statistical inference In multivariate statistics, we observe the values of a multivariate random variable $x = (x_1, \ldots, x_p)'$ and obtain a sample $x_i = (x_{i1}, \ldots, x_{ip})'$, for $i = 1, \ldots, n$, summarised in a data matrix X. For a given random sample, $x_1, \ldots, x_n$, the idea of statistical inference is to analyse the properties of the population random variable x. If we do not know the distribution of x, statistical inference can often be performed using some observable functions of the sample $x_1, \ldots, x_n$, i.e., statistics. Examples of statistics are the sample mean and the sample covariance matrix.

60 Statistical inference To get an idea of the relationship between a statistic and the corresponding population counterpart, one has to derive the sampling distribution of the statistic. Given a random sample, $x_1, \ldots, x_n$, of the population random variable x such that $E[x] = \mu_x$ and $\mathrm{Cov}[x] = \Sigma_x$, the sample mean vector $\bar{x}$ and the sample covariance matrix $S_x$ verify the following properties: 1. $E[\bar{x}] = \mu_x$. 2. $\mathrm{Cov}[\bar{x}] = \frac{1}{n}\Sigma_x$. 3. $E[S_x] = \Sigma_x$.
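The three properties can be illustrated by Monte Carlo over repeated samples; the bivariate Gaussian population below is made up, and $S_x$ here uses the divisor n − 1:

```python
import numpy as np

rng = np.random.default_rng(11)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
n, reps = 10, 100_000

# reps independent samples of size n from the population.
data = rng.multivariate_normal(mu, Sigma, size=(reps, n))   # (reps, n, 2)
means = data.mean(axis=1)                                   # one xbar per sample
centered = data - means[:, None, :]
S = np.einsum('rij,rik->rjk', centered, centered) / (n - 1) # one S_x per sample

assert np.allclose(means.mean(axis=0), mu, atol=0.02)                  # E[xbar] = mu
assert np.allclose(np.cov(means, rowvar=False), Sigma / n, atol=0.02)  # Cov[xbar] = Sigma/n
assert np.allclose(S.mean(axis=0), Sigma, atol=0.02)                   # E[S_x] = Sigma
```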

61 Statistical inference Statistical inference often requires more than just the mean and/or the covariance of a statistic. We need the sampling distribution of the statistic to derive confidence intervals or to define rejection regions in hypothesis testing for a given significance level. For instance, in the Gaussian case, we have the following result. Theorem: Let $x_1, \ldots, x_n$ be i.i.d. with $x_i \sim N(\mu_x, \Sigma_x)$. Then, $\bar{x} \sim N(\mu_x, \frac{1}{n}\Sigma_x)$. The central limit theorem shows that even if the parent distribution is not Gaussian, when the sample size n is large, the sample mean vector $\bar{x}$ has an approximate Gaussian distribution. Central Limit Theorem (CLT): Let $x_1, \ldots, x_n$ be i.i.d. with $x_i \sim (\mu_x, \Sigma_x)$. Then, the distribution of $\sqrt{n}(\bar{x} - \mu_x)$ is asymptotically $N(0_p, \Sigma_x)$, i.e., $\sqrt{n}(\bar{x} - \mu_x) \xrightarrow{d} N(0_p, \Sigma_x)$ as $n \to \infty$.

62 Statistical inference

The next two slides show multivariate kernel density estimates of 2000 sample mean vectors, computed from 2000 samples of a certain bivariate random variable.

The first slide corresponds to the case n = 5; the second slide to the case n = 100.

It is easy to see that the second estimate appears closer to a bivariate Gaussian distribution than the first one.

63 Statistical inference

[Figure: kernel density estimate of the 2000 sample mean vectors for n = 5; axes x1 and x2.]

64 Statistical inference

[Figure: kernel density estimate of the 2000 sample mean vectors for n = 100; axes x1 and x2.]

65 Statistical inference

If we assume that we know the distribution of the multivariate random variable x, then the main goal of statistical inference is to estimate the parameters of this distribution.

Let θ = (θ_1, ..., θ_r)′ be the vector of parameters of a certain distribution with density function f(· | θ). The aim is to estimate the vector θ from an i.i.d. sample x_1, ..., x_n from x.

The most important method to carry out this task is maximum likelihood estimation (MLE).

66 Statistical inference

Let x_1, ..., x_n be an i.i.d. sample of x. Then, the joint pdf of x_1, ..., x_n is given by:

f(x_1, ..., x_n | θ) = ∏_{i=1}^n f(x_i | θ)

Note that the sample (the data matrix X) is known but θ is unknown. In MLE, θ is treated as a variable and X is held fixed, leading to the likelihood function:

l(θ | X) = ∏_{i=1}^n f(x_i | θ)

where x_i = (x_i1, ..., x_ip)′.

The likelihood function measures how plausible each value of θ is given the observed data X (note that, in general, it is not a pdf in θ).

67 Maximum likelihood estimation

The maximum likelihood estimate (MLE) of θ, denoted by θ̂, is the value of θ that maximizes l(θ | X), i.e.:

θ̂ = arg max_θ l(θ | X)

In other words, the MLE θ̂ is the value of θ that maximizes the probability of obtaining the sample under study.

Often it is easier to maximize the logarithm of the likelihood function, named the log-likelihood function or support function:

L(θ | X) = log l(θ | X)

which is equivalent since the logarithm is a monotone one-to-one function. Hence,

θ̂ = arg max_θ l(θ | X) = arg max_θ L(θ | X)
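A minimal numerical sketch of this equivalence (univariate Gaussian mean with known unit variance, simulated data; the grid search is for illustration only): maximizing the likelihood and maximizing the log-likelihood over the same grid of θ values select the same point, which agrees with the sample mean:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=1.0, size=500)  # i.i.d. sample, true mean 2

theta_grid = np.linspace(0.0, 4.0, 4001)

# Log-likelihood of a N(theta, 1) model on the grid (constants in theta dropped).
loglik = np.array([-0.5 * np.sum((x - t) ** 2) for t in theta_grid])

# The raw likelihood would underflow for n = 500, so we exponentiate after
# subtracting the maximum -- a monotone shift that preserves the argmax.
lik = np.exp(loglik - loglik.max())

theta_hat_log = theta_grid[np.argmax(loglik)]
theta_hat_lik = theta_grid[np.argmax(lik)]

print(theta_hat_log == theta_hat_lik)        # same maximizer
print(abs(theta_hat_log - x.mean()) < 1e-3)  # matches the sample mean
```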

68 Maximum likelihood estimation

Usually, the maximisation cannot be performed analytically, and nonlinear optimization techniques are required.

In this case, given a data matrix X and the likelihood function, numerical methods are used to determine the value of θ maximising L(θ | X) or l(θ | X). These numerical methods are typically based on Newton-Raphson techniques.
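A sketch of the Newton-Raphson idea for a scalar parameter (MLE of the rate of an exponential sample; this example is not from the slides, and the closed form λ̂ = 1/x̄ lets us check the iteration):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(scale=0.5, size=1000)  # true rate = 2

# Log-likelihood of Exp(rate r): L(r) = n log r - r * sum(x).
# Score:   L'(r)  = n / r - sum(x)
# Hessian: L''(r) = -n / r**2
n, s = x.size, x.sum()

r = 1.0  # starting value
for _ in range(50):
    score = n / r - s
    hess = -n / r**2
    step = score / hess
    r = r - step  # Newton-Raphson update: r_new = r - L'(r) / L''(r)
    if abs(step) < 1e-12:
        break

print(r, 1.0 / x.mean())  # the iteration converges to the closed-form MLE
```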

69 Maximum likelihood estimation

Let x_1, ..., x_n be a simple random sample from x ∼ N(µ_x, Σ_x). Then, the joint density function is:

f(x_1, ..., x_n | µ_x, Σ_x) = ∏_{i=1}^n (2π)^{−p/2} |Σ_x|^{−1/2} exp( −(x_i − µ_x)′ Σ_x^{−1} (x_i − µ_x) / 2 )

Then, the support function is given by:

L(µ_x, Σ_x | X) = −(np/2) log 2π − (n/2) log |Σ_x| − (1/2) ∑_{i=1}^n (x_i − µ_x)′ Σ_x^{−1} (x_i − µ_x)

Next, note that we can write:

∑_{i=1}^n (x_i − µ_x)′ Σ_x^{−1} (x_i − µ_x) = Tr[ Σ_x^{−1} ( ∑_{i=1}^n (x_i − µ_x)(x_i − µ_x)′ ) ]

70 Maximum likelihood estimation

On the other hand, adding and subtracting the sample mean vector x̄ in (x_i − µ_x) leads to:

∑_{i=1}^n (x_i − µ_x)(x_i − µ_x)′ = ∑_{i=1}^n (x_i − x̄ + x̄ − µ_x)(x_i − x̄ + x̄ − µ_x)′ = ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′ + n (x̄ − µ_x)(x̄ − µ_x)′

because the cross terms ∑_{i=1}^n (x_i − x̄)(x̄ − µ_x)′ and ∑_{i=1}^n (x̄ − µ_x)(x_i − x̄)′ are both matrices of zeros.

71 Maximum likelihood estimation

Consequently:

∑_{i=1}^n (x_i − µ_x)′ Σ_x^{−1} (x_i − µ_x) = Tr[ Σ_x^{−1} ( ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′ + n (x̄ − µ_x)(x̄ − µ_x)′ ) ] = Tr[ Σ_x^{−1} ( ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′ ) ] + n (x̄ − µ_x)′ Σ_x^{−1} (x̄ − µ_x)

72 Maximum likelihood estimation

Therefore, the support function can be written as:

L(µ_x, Σ_x | X) = −(np/2) log 2π − (n/2) log |Σ_x| − (1/2) ( Tr[ Σ_x^{−1} ( ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′ ) ] + n (x̄ − µ_x)′ Σ_x^{−1} (x̄ − µ_x) )

Now, L(µ_x, Σ_x | X) depends on µ_x only through the last term, and that term is maximized when (x̄ − µ_x)′ Σ_x^{−1} (x̄ − µ_x) = 0. Therefore, the MLE of µ_x is µ̂_x = x̄.

73 Maximum likelihood estimation

It remains to maximize:

L(Σ_x | X, µ_x = x̄) = −(np/2) log 2π − (n/2) log |Σ_x| − (1/2) Tr[ Σ_x^{−1} ( ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′ ) ]

For that, we need a result from matrix algebra: given a p × p symmetric positive definite matrix B and a scalar b > 0, it follows that:

−b log |Σ_x| − (1/2) Tr( Σ_x^{−1} B ) ≤ −b log |B| + pb log(2b) − pb

with equality when Σ_x = (1/2b) B. Then, taking b = n/2 and B = ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′ shows that the MLE of Σ_x is:

Σ̂_x = (1/n) ∑_{i=1}^n (x_i − x̄)(x_i − x̄)′

Note that the MLE of Σ_x is not the sample covariance matrix but a re-scaled version of it.
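The closed-form Gaussian MLEs are easy to compute in practice: µ̂_x = x̄ and Σ̂_x is the denominator-n (biased) covariance matrix. A sketch with simulated data (parameters chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.multivariate_normal([0.0, 1.0], [[1.0, 0.3], [0.3, 2.0]], size=400)
n = X.shape[0]

mu_hat = X.mean(axis=0)    # MLE of the mean vector: the sample mean
D = X - mu_hat             # centred data
Sigma_hat = (D.T @ D) / n  # MLE of the covariance: denominator n, not n-1

# The sample covariance matrix uses 1/(n-1); the MLE is its re-scaled version.
S = np.cov(X, rowvar=False)
print(np.allclose(Sigma_hat, (n - 1) / n * S))  # True
```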

74 Maximum likelihood estimation

The next theorem gives the asymptotic sampling distribution of the MLE, which turns out to be Gaussian.

Theorem. Suppose that the sample x_1, ..., x_n is i.i.d. If θ̂ is the MLE of θ ∈ R^r, i.e., θ̂ = arg max_θ L(θ | X), then, under some regularity conditions, as n → ∞:

√n (θ̂ − θ) →_d N(0_r, F^{−1})

where F denotes the Fisher information matrix given by:

F = −(1/n) E[ ∂²L(θ | X) / ∂θ ∂θ′ ]

As a consequence of this theorem, we see that under regularity conditions the MLE is asymptotically unbiased, efficient (minimum variance) and Gaussian distributed. It is also a consistent estimator of θ.

75 Hypothesis testing

We now turn our interest towards hypothesis testing. In particular, we go over a general methodology to construct tests, called the likelihood ratio method, and apply it to the case of Gaussian populations.

We assume an r-dimensional parameter vector θ that takes values in Ω ⊆ R^r. We want to test the hypothesis H_0 that the unknown parameter θ belongs to some subspace of R^r, called the null set and denoted by Ω_0 ⊆ R^r. Consequently, we want to test the hypothesis:

H_0 : θ ∈ Ω_0

versus the alternative hypothesis:

H_1 : θ ∈ Ω

which does not restrict θ to Ω_0.

76 Hypothesis testing

For example, consider a multivariate Gaussian N(µ_x, Σ_x). To test whether µ_x equals a certain fixed value µ_0, we construct the test problem:

H_0 : µ_x = µ_0
H_1 : no constraints on µ_x

In this example we have Ω_0 = {µ_0} and Ω = R^p.

77 Hypothesis testing

Define l_0 = max_{θ∈Ω_0} l(θ | X) and l = max_{θ∈Ω} l(θ | X), the values of the maximized likelihood under H_0 and H_1, respectively.

Consider the likelihood ratio (LR) given by:

LR = l_0 / l

By construction, 0 ≤ LR ≤ 1, and one tends to favour H_0 if the LR is high (close to 1) and H_1 if the LR is low (not close to 1).

The likelihood ratio test (LRT) tells us exactly when to favour H_0 over H_1. It is based on the statistic:

λ = −2 ln LR = −2 (ln l_0 − ln l)

The LRT statistic λ is asymptotically distributed as a χ² distribution with degrees of freedom equal to the difference between the dimensions of the spaces Ω and Ω_0.

78 Hypothesis testing

Given a sample from a population N(µ_x, Σ_x), we want to test the hypothesis:

H_0 : µ_x = µ_0

against the alternative:

H_1 : µ_x ≠ µ_0

It is possible to show that the likelihood ratio test statistic is given by:

λ = n log ( |Σ̂_0| / |Σ̂_x| )

where:

Σ̂_0 = (1/n) ∑_{i=1}^n (x_i − µ_0)(x_i − µ_0)′

and λ has an asymptotic χ² distribution with p degrees of freedom.
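A sketch of this test on simulated Gaussian data (the data and µ_0 are made up; scipy's chi-square survival function supplies the p-value):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(6)
n, p = 300, 4
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)  # H0 is true here

mu0 = np.zeros(p)  # hypothesised mean under H0
x_bar = X.mean(axis=0)

# MLE of Sigma under H1 (deviations from x_bar) and under H0 (deviations from mu0).
D1 = X - x_bar
Sigma_hat = (D1.T @ D1) / n
D0 = X - mu0
Sigma_0 = (D0.T @ D0) / n

# LRT statistic: lambda = n log(|Sigma_0| / |Sigma_hat|), asymptotically chi2_p under H0.
lam = n * (np.linalg.slogdet(Sigma_0)[1] - np.linalg.slogdet(Sigma_hat)[1])
p_value = chi2.sf(lam, df=p)
print(lam, p_value)
```

Using `slogdet` avoids overflow in the determinants; λ is always nonnegative because Σ̂_0 = Σ̂_x + (x̄ − µ_0)(x̄ − µ_0)′.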

79 Illustrative example (I)

Consider the daily log-returns (in percentages) of four major European stock indices: Germany (DAX), Switzerland (SMI), France (CAC) and UK (FTSE), from 1991 to […]. We want to test the null hypothesis that the mean vector of returns is zero (assuming Gaussianity).

The estimated mean vector is:

x̄ = (0.065, 0.081, 0.043, 0.043)′

The covariance matrix estimated under H_0 is:

Σ̂_0 = […]

80 Illustrative example (I)

The covariance matrix estimated under H_1 is:

Σ̂_x = […]

The value of the statistic is λ = […], with associated p-value […]. Thus, we reject H_0 at the 5% significance level but we cannot reject H_0 at the 1% significance level.

81 Hypothesis testing

Given a sample of a population N(µ_x, Σ_x), we want to test the hypothesis:

H_0 : Σ_x = Σ_0

against the alternative:

H_1 : Σ_x ≠ Σ_0

It is possible to show that the likelihood ratio test statistic is given by:

λ = n log ( |Σ_0| / |Σ̂_x| ) + n Tr( Σ_0^{−1} Σ̂_x ) − np

which has an asymptotic χ² distribution with p(p + 1)/2 degrees of freedom.
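A sketch of the covariance test, again on simulated data, with Σ_0 = I as an arbitrary null value:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
n, p = 500, 3
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)  # H0 is true here

Sigma0 = np.eye(p)                      # hypothesised covariance under H0
D = X - X.mean(axis=0)
Sigma_hat = (D.T @ D) / n               # MLE of Sigma under H1

A = np.linalg.solve(Sigma0, Sigma_hat)  # Sigma0^{-1} Sigma_hat

# lambda = n log(|Sigma0|/|Sigma_hat|) + n tr(Sigma0^{-1} Sigma_hat) - n p,
# asymptotically chi2 with p(p+1)/2 degrees of freedom under H0.
lam = (n * (np.linalg.slogdet(Sigma0)[1] - np.linalg.slogdet(Sigma_hat)[1])
       + n * np.trace(A) - n * p)
p_value = chi2.sf(lam, df=p * (p + 1) // 2)
print(lam, p_value)
```

λ is nonnegative here too: writing the eigenvalues of Σ_0^{−1} Σ̂_x as λ_i, the statistic equals n ∑ (λ_i − log λ_i − 1) ≥ 0.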

82 Hypothesis testing

It is also of interest to know whether Σ_x is diagonal, in which case the univariate variables are independent. In this case, we gain nothing from analyzing them jointly since they have no information in common. Then we test:

H_0 : Σ_x diagonal

against the alternative:

H_1 : Σ_x unrestricted

It is possible to show that the likelihood ratio test statistic is given by:

λ = −n log |R_x|

which has an asymptotic χ² distribution with p(p − 1)/2 degrees of freedom.
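A sketch of the diagonality test using the sample correlation matrix (the data are simulated with strong correlations, so H_0 should be rejected):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(8)
n, p = 400, 3
Sigma = np.array([[1.0, 0.7, 0.3],
                  [0.7, 1.0, 0.5],
                  [0.3, 0.5, 1.0]])  # strongly correlated variables
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

R = np.corrcoef(X, rowvar=False)     # sample correlation matrix

# lambda = -n log |R_x|, asymptotically chi2 with p(p-1)/2 df under H0.
# Since |R_x| <= 1, lambda is always nonnegative.
lam = -n * np.linalg.slogdet(R)[1]
p_value = chi2.sf(lam, df=p * (p - 1) // 2)
print(lam, p_value)                  # large lambda, tiny p-value: reject H0
```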

83 Illustrative example (I)

Consider again the daily log-returns (in percentages) of the four major European stock indices. We test the null hypothesis of independence (assuming Gaussianity).

The estimated correlation matrix is:

R_x = […]

The value of the statistic is λ = […], with an associated p-value of virtually zero. Thus, we reject H_0 at the usual significance levels.

84 Hypothesis testing

Assume that we have observed a sample of size n of a p-dimensional variable x = (x_1, ..., x_p)′ that can be split into G groups, so that there are n_1 observations of group 1, and so on.

Our goal here is to check whether the means of the G groups are equal or not, assuming Gaussianity and that the covariance matrix Σ_x is the same for all the groups. Then, the hypothesis to be tested is:

H_0 : µ_1 = ··· = µ_G = µ_x

and the alternative hypothesis is:

H_1 : not all the µ_g are equal

This problem is known as the multivariate analysis of variance (MANOVA).

85 Hypothesis testing

The likelihood ratio test method leads to the statistic:

λ = n log ( |Σ̂_x| / |S_W| )

where Σ̂_x is the MLE of Σ_x under Gaussianity, and S_W = W/n, where:

W = ∑_{g=1}^G ∑_{i=1}^{n_g} (x_ig − x̄_g)(x_ig − x̄_g)′

where x_ig is the i-th observation in group g and x̄_g is the sample mean vector of the observations in group g.

W is usually called the within-groups variability matrix, or the matrix of deviations with respect to the means of each group.

86 Hypothesis testing

The statistic λ has an asymptotic χ² distribution with p(G − 1) degrees of freedom. However, this approximation can be improved for small sample sizes. For instance, the statistic:

λ_0 = m log ( |Σ̂_x| / |S_W| )

asymptotically follows a χ² distribution with p(G − 1) degrees of freedom, where m = n − 1 − (p + G)/2.

87 Hypothesis testing

This test can be derived in an alternative way. Let:

T = n Σ̂_x = ∑_{g=1}^G ∑_{i=1}^{n_g} (x_ig − x̄)(x_ig − x̄)′

be the total variability of the data, which measures the deviations with respect to a common mean.

The matrix T can be decomposed as the sum of two matrices. The first one is the matrix W defined previously. The second one measures the between-groups variability, explained by the differences between means, which we denote by B:

B = ∑_{g=1}^G n_g (x̄_g − x̄)(x̄_g − x̄)′

Therefore, we can write:

T (Total variability) = B (Explained variability) + W (Residual variability)

88 Hypothesis testing

In order to test whether the means are equal, we can compare the sizes of the matrices T and W, taking their determinants as the measure of size. Then, we can propose a test based on the ratio |T| / |W|.

For moderate sample sizes, this test is similar to the likelihood ratio test based on the statistic λ_0, which can also be written as:

λ_0 = m log ( |Σ̂_x| / |S_W| ) = m log ( |T| / |W| )
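The T = B + W decomposition and the corrected statistic λ_0 can be sketched directly. The three simulated groups below have different means (the group sizes and mean vectors are made up), so H_0 should be rejected:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(9)
p, G, n_g = 2, 3, 60
mus = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
groups = [rng.multivariate_normal(m, np.eye(p), size=n_g) for m in mus]

X = np.vstack(groups)
n = X.shape[0]
x_bar = X.mean(axis=0)

# Within-groups (W) and between-groups (B) variability matrices.
W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
B = sum(len(g) * np.outer(g.mean(axis=0) - x_bar, g.mean(axis=0) - x_bar) for g in groups)
T = (X - x_bar).T @ (X - x_bar)

assert np.allclose(T, B + W)  # total = explained + residual

# Corrected LRT statistic: lambda_0 = m log(|T|/|W|), m = n - 1 - (p + G)/2,
# asymptotically chi2 with p(G-1) df under H0 (equal group means).
m = n - 1 - (p + G) / 2
lam0 = m * (np.linalg.slogdet(T)[1] - np.linalg.slogdet(W)[1])
p_value = chi2.sf(lam0, df=p * (G - 1))
print(lam0, p_value)          # means differ, so the p-value is small
```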

89 Illustrative example (II)

We consider the Iris dataset, consisting of four univariate variables measured on 150 flowers of 3 different species (setosa, versicolor and virginica). There are 50 flowers of each species:

x1: Length of the sepal (in mm.).
x2: Width of the sepal (in mm.).
x3: Length of the petal (in mm.).
x4: Width of the petal (in mm.).

The next slide shows the scatterplot matrix of the dataset.

90 Illustrative example (II)

[Figure: scatterplot matrix of the Iris dataset, with variables Sepal.Length, Sepal.Width, Petal.Length and Petal.Width.]

91 Illustrative example (II)

We test the equality of means for the 3 groups of the Iris dataset. The sample mean vectors of the 3 groups are:

x̄_1 = […]    x̄_2 = […]    x̄_3 = […]

The value of the statistic λ is […], with an associated p-value of virtually zero. Thus, we reject H_0. On the other hand, the value of the statistic λ_0 is […], also with a p-value of virtually zero, so we reject H_0 with this statistic as well.

Consequently, we reject that the three subsets of observations have the same means.

92 Chapter outline

1 Introduction.
2 Basic concepts.
3 Multivariate distributions.
4 Statistical inference.
5 Hypothesis testing.

We are now ready for:

Chapter 3: Principal components


The Instability of Correlations: Measurement and the Implications for Market Risk The Instability of Correlations: Measurement and the Implications for Market Risk Prof. Massimo Guidolin 20254 Advanced Quantitative Methods for Asset Pricing and Structuring Winter/Spring 2018 Threshold

More information

STAT 4385 Topic 01: Introduction & Review

STAT 4385 Topic 01: Introduction & Review STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics

More information

Heteroskedasticity; Step Changes; VARMA models; Likelihood ratio test statistic; Cusum statistic.

Heteroskedasticity; Step Changes; VARMA models; Likelihood ratio test statistic; Cusum statistic. 47 3!,57 Statistics and Econometrics Series 5 Febrary 24 Departamento de Estadística y Econometría Universidad Carlos III de Madrid Calle Madrid, 126 2893 Getafe (Spain) Fax (34) 91 624-98-49 VARIANCE

More information

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline.

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline. MFM Practitioner Module: Risk & Asset Allocation September 11, 2013 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

1. Density and properties Brief outline 2. Sampling from multivariate normal and MLE 3. Sampling distribution and large sample behavior of X and S 4.

1. Density and properties Brief outline 2. Sampling from multivariate normal and MLE 3. Sampling distribution and large sample behavior of X and S 4. Multivariate normal distribution Reading: AMSA: pages 149-200 Multivariate Analysis, Spring 2016 Institute of Statistics, National Chiao Tung University March 1, 2016 1. Density and properties Brief outline

More information

Variations. ECE 6540, Lecture 10 Maximum Likelihood Estimation

Variations. ECE 6540, Lecture 10 Maximum Likelihood Estimation Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter

More information

Multivariate Non-Normally Distributed Random Variables

Multivariate Non-Normally Distributed Random Variables Multivariate Non-Normally Distributed Random Variables An Introduction to the Copula Approach Workgroup seminar on climate dynamics Meteorological Institute at the University of Bonn 18 January 2008, Bonn

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2 Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate

More information

Statistics and Data Analysis

Statistics and Data Analysis Statistics and Data Analysis The Crash Course Physics 226, Fall 2013 "There are three kinds of lies: lies, damned lies, and statistics. Mark Twain, allegedly after Benjamin Disraeli Statistics and Data

More information

conditional cdf, conditional pdf, total probability theorem?

conditional cdf, conditional pdf, total probability theorem? 6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Exercises Chapter 4 Statistical Hypothesis Testing

Exercises Chapter 4 Statistical Hypothesis Testing Exercises Chapter 4 Statistical Hypothesis Testing Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans December 5, 013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

Independent Component (IC) Models: New Extensions of the Multinormal Model

Independent Component (IC) Models: New Extensions of the Multinormal Model Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Bayesian Decision Theory

Bayesian Decision Theory Bayesian Decision Theory Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University) 1 / 46 Bayesian

More information

Correlation analysis. Contents

Correlation analysis. Contents Correlation analysis Contents 1 Correlation analysis 2 1.1 Distribution function and independence of random variables.......... 2 1.2 Measures of statistical links between two random variables...........

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

STA 2201/442 Assignment 2

STA 2201/442 Assignment 2 STA 2201/442 Assignment 2 1. This is about how to simulate from a continuous univariate distribution. Let the random variable X have a continuous distribution with density f X (x) and cumulative distribution

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

Lecture 11. Probability Theory: an Overveiw

Lecture 11. Probability Theory: an Overveiw Math 408 - Mathematical Statistics Lecture 11. Probability Theory: an Overveiw February 11, 2013 Konstantin Zuev (USC) Math 408, Lecture 11 February 11, 2013 1 / 24 The starting point in developing the

More information

MULTIVARIATE HOMEWORK #5

MULTIVARIATE HOMEWORK #5 MULTIVARIATE HOMEWORK #5 Fisher s dataset on differentiating species of Iris based on measurements on four morphological characters (i.e. sepal length, sepal width, petal length, and petal width) was subjected

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Multivariate random variables

Multivariate random variables DS-GA 002 Lecture notes 3 Fall 206 Introduction Multivariate random variables Probabilistic models usually include multiple uncertain numerical quantities. In this section we develop tools to characterize

More information

University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout 2:. The Multivariate Gaussian & Decision Boundaries

University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout 2:. The Multivariate Gaussian & Decision Boundaries University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout :. The Multivariate Gaussian & Decision Boundaries..15.1.5 1 8 6 6 8 1 Mark Gales mjfg@eng.cam.ac.uk Lent

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

Multivariate Distributions

Multivariate Distributions Copyright Cosma Rohilla Shalizi; do not distribute without permission updates at http://www.stat.cmu.edu/~cshalizi/adafaepov/ Appendix E Multivariate Distributions E.1 Review of Definitions Let s review

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

BASICS OF PROBABILITY

BASICS OF PROBABILITY October 10, 2018 BASICS OF PROBABILITY Randomness, sample space and probability Probability is concerned with random experiments. That is, an experiment, the outcome of which cannot be predicted with certainty,

More information

Lecture Note 1: Probability Theory and Statistics

Lecture Note 1: Probability Theory and Statistics Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

COM336: Neural Computing

COM336: Neural Computing COM336: Neural Computing http://www.dcs.shef.ac.uk/ sjr/com336/ Lecture 2: Density Estimation Steve Renals Department of Computer Science University of Sheffield Sheffield S1 4DP UK email: s.renals@dcs.shef.ac.uk

More information

Classification Methods II: Linear and Quadratic Discrimminant Analysis

Classification Methods II: Linear and Quadratic Discrimminant Analysis Classification Methods II: Linear and Quadratic Discrimminant Analysis Rebecca C. Steorts, Duke University STA 325, Chapter 4 ISL Agenda Linear Discrimminant Analysis (LDA) Classification Recall that linear

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

1. Introduction to Multivariate Analysis

1. Introduction to Multivariate Analysis 1. Introduction to Multivariate Analysis Isabel M. Rodrigues 1 / 44 1.1 Overview of multivariate methods and main objectives. WHY MULTIVARIATE ANALYSIS? Multivariate statistical analysis is concerned with

More information

TAMS39 Lecture 2 Multivariate normal distribution

TAMS39 Lecture 2 Multivariate normal distribution TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution

More information