Stat 206: the Multivariate Normal distribution

James Johndrow (adapted from Iain Johnstone's notes)

Introduction

The multivariate normal distribution plays a central role in multivariate statistics for several reasons.

1. It is considerably more mathematically tractable than other multivariate distributions.

2. The multivariate generalization of the central limit theorem (see the notes from lecture) says that properly centered and scaled sums of random vectors are asymptotically multivariate normal. Since many scientifically meaningful multivariate statistics are sums of vectors, this means that many statistics have approximately a multivariate normal distribution in large samples.

3. Much more general sampling models can be constructed using multivariate normal distributions as building blocks (e.g. mixtures).

This is where we will begin to depart from the book a bit, and this departure will persist throughout much of the course. The book devotes a lot of space to multivariate tests that generally rely on the assumption of joint normality, and often additional assumptions, to derive the distribution of the test statistic. For a variety of reasons, this assumption is practically stronger than the commonly used assumption of Gaussian residuals in the linear model, and somewhat harder to work around. As a result, tests that require joint normality often aren't very useful, and we won't spend a lot of time on them. On the other hand, the book devotes relatively little attention to the virtues of the multivariate normal distribution as the asymptotic distribution of many statistics of interest, and to the ease with which more complicated sampling models can be constructed from it. We will devote somewhat more time to these issues, the latter mainly in the context of clustering.

Definition

The $p$-dimensional multivariate normal distribution with mean $\mu$ and covariance $\Sigma$ has density

$$f(x) = |2\pi\Sigma|^{-1/2} \exp\left(-(x-\mu)^\top \Sigma^{-1} (x-\mu)/2\right),$$

where $x, \mu \in \mathbb{R}^p$ and $\Sigma$ is a symmetric and positive-definite matrix. When a random vector $X$ has this density, we write $X \sim N(\mu, \Sigma)$.
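To make the definition concrete, here is a minimal sketch in R, assuming the mvtnorm package is available (dmvnorm is its density function), that evaluates the density at a point and checks it against the formula above:

library(mvtnorm)

mu <- c(0, 1)
Sigma <- matrix(c(2, .5, .5, 1), 2, 2)
x <- c(.3, -.2)

# density via the package
dmvnorm(x, mean = mu, sigma = Sigma)
# density via the formula: |2 pi Sigma|^{-1/2} exp(-(x - mu)' Sigma^{-1} (x - mu)/2)
det(2*pi*Sigma)^(-1/2) * exp(-t(x - mu) %*% solve(Sigma) %*% (x - mu) / 2)

The two values agree.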

Since, by definition, densities integrate to one over their support, we can ascertain that

$$\int_{x \in \mathbb{R}^p} \exp\left(-(x-\mu)^\top \Sigma^{-1} (x-\mu)/2\right) dx = |2\pi\Sigma|^{1/2},$$

the multivariate generalization of the Gaussian integral. Some authors define the multivariate normal by its moment-generating function:

$$E[e^{t^\top X}] = \exp\left(\mu^\top t + t^\top \Sigma t / 2\right).$$

The multivariate normal distribution is, in a fundamental sense, the distribution of affine transformations of independent Gaussians.

Theorem 1. Let $Z_1, \ldots, Z_p$ be iid $N(0, 1)$, let $\Sigma = U \Lambda U^\top$ be a positive-definite, symmetric matrix, and let $\mu \in \mathbb{R}^p$ be a real vector. Then if $Z = (Z_1, \ldots, Z_p)^\top$ and $\Sigma^{1/2} = U \Lambda^{1/2} U^\top$ is the square root of $\Sigma$, we have

$$X = \mu + \Sigma^{1/2} Z \sim N(\mu, \Sigma).$$

Proof. By independence,

$$f(z_1, \ldots, z_p) = \prod_j f_j(z_j) = (2\pi)^{-p/2} \exp\left(-z^\top z / 2\right).$$

Make the transformation $x = \mu + \Sigma^{1/2} z$, so that $z = \Sigma^{-1/2}(x - \mu)$. Since

$$\frac{\partial z}{\partial x} = \frac{\partial}{\partial x} \Sigma^{-1/2}(x - \mu) = \Sigma^{-1/2},$$

the Jacobian determinant is $|\Sigma^{-1/2}| = |\Sigma|^{-1/2}$, giving

$$f(x) = |\Sigma|^{-1/2} (2\pi)^{-p/2} \exp\left(-(\Sigma^{-1/2}(x-\mu))^\top (\Sigma^{-1/2}(x-\mu))/2\right) = |2\pi\Sigma|^{-1/2} \exp\left(-(x-\mu)^\top \Sigma^{-1}(x-\mu)/2\right).$$

Using this and the following result on the mean and covariance of linear functions, we can obtain the mean and covariance of the multivariate normal.

Theorem 2. Let $X$ be a random $p$-vector with mean $\mu$ and covariance $\Sigma$, and let $A$ be a $q \times p$ matrix and $a$ a $q$-vector. Then

$$E[a + AX] = a + A\mu, \qquad \mathrm{cov}(a + AX) = A \Sigma A^\top.$$

The proof is by direct calculation, and is omitted as the task is rather dull; feel free to grind through it if interested, as it is a good test of facility with matrix operations.
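Theorem 1 doubles as a simulation recipe, and Theorem 2 can then be checked empirically. Here is a minimal sketch in base R, building the square root from the eigendecomposition exactly as in the theorem:

set.seed(1)
n <- 1e5; p <- 2
mu <- c(1, -1)
Sigma <- matrix(c(1, .8, .8, 2), 2, 2)

ee <- eigen(Sigma)
Sig.half <- ee$vectors %*% diag(sqrt(ee$values)) %*% t(ee$vectors)  # U Lambda^{1/2} U'

Z <- matrix(rnorm(n*p), n, p)            # rows are iid N(0, I_p) vectors
X <- sweep(Z %*% Sig.half, 2, mu, '+')   # each row is mu + Sigma^{1/2} z (Sig.half is symmetric)

colMeans(X)   # approximately mu
cov(X)        # approximately Sigma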

Together, Theorems 1 and 2 give us:

Corollary 1. If $X \sim N(\mu, \Sigma)$ then $E[X] = \mu$ and $\mathrm{cov}(X) = \Sigma$.

Proof. By Theorem 1, $X = \mu + \Sigma^{1/2} Z$ for $Z \sim N(0, I)$. By Theorem 2,

$$E[X] = \mu + \Sigma^{1/2} E[Z] = \mu, \qquad \mathrm{cov}(X) = \Sigma^{1/2} I (\Sigma^{1/2})^\top = \Sigma^{1/2} \Sigma^{1/2} = \Sigma.$$

This implies something very strong about the multivariate normal distribution that is not typical of multivariate distributions in general.

Theorem 3. There is exactly one multivariate normal distribution with mean $\mu$ and covariance $\Sigma$.

Proof. Combine Theorem 1 with the fact that two random variables with the same density have the same distribution.

Thus, analogous to the univariate case, a multivariate normal distribution is completely characterized by its first two moments. We also have:

Corollary 2. If $X \sim N(\mu, \Sigma)$ and $A$ is full-rank, then $a + AX \sim N(a + A\mu, A\Sigma A^\top)$.

Proof. Analogous to the proof of Theorem 1.

We conclude this section with a warning. It is possible to have two dependent random vectors $X_1$ and $X_2$, each of which is marginally multivariate normal, but which are not jointly multivariate normal. We will revisit this later in the notes.

Geometry of the normal density and Gaussian dependence structure

Another reason for the use of the Mahalanobis topology in multivariate statistics is that it describes the shape of the multivariate normal. Let $x, y \in \mathbb{R}^p$ be two points that have identical values of the multivariate normal density $f$ with mean $\mu$ and covariance $\Sigma$. Then

$$f(x) = |2\pi\Sigma|^{-1/2} \exp(-(x-\mu)^\top \Sigma^{-1}(x-\mu)/2) = |2\pi\Sigma|^{-1/2} \exp(-(y-\mu)^\top \Sigma^{-1}(y-\mu)/2) = f(y),$$

so

$$\exp(-(x-\mu)^\top \Sigma^{-1}(x-\mu)/2) = \exp(-(y-\mu)^\top \Sigma^{-1}(y-\mu)/2) \implies (x-\mu)^\top \Sigma^{-1}(x-\mu) = (y-\mu)^\top \Sigma^{-1}(y-\mu).$$

Therefore the sets of all points having identical value of the density (the level sets of $f$) are given by

$$\mathcal{X}_{c^2} = \{x : (x-\mu)^\top \Sigma^{-1}(x-\mu) = c^2\} = \{x : \|x - \mu\|_\Sigma = c\}$$

for the Mahalanobis distance $\|x - \mu\|_\Sigma$: the equation of an ellipsoid centered at $\mu$. The following allows us to compute the probability associated with the interior of any level set.

Remark 1. If $X \sim N(\mu, \Sigma)$, then $(X - \mu)^\top \Sigma^{-1} (X - \mu) \sim \chi^2_p$, a $\chi^2$ random variable with $p$ degrees of freedom.

Proof. $\Sigma^{-1/2}(X - \mu) \sim N(0, I)$, so $(X-\mu)^\top \Sigma^{-1}(X-\mu) = \sum_j Z_j^2$ for $Z_j$ iid $N(0, 1)$.

Therefore, the region $(x-\mu)^\top \Sigma^{-1}(x-\mu) \le F^{-1}_{\chi^2_p}(1-\alpha)$, where $F_{\chi^2_p}$ is the CDF of $\chi^2_p$, is an elliptical set containing $1 - \alpha$ of the probability assigned by the multivariate normal distribution.
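Remark 1 is easy to check by simulation. Here is a minimal sketch, assuming the mvtnorm package for the draws (base R's mahalanobis() computes exactly the quadratic form above):

library(mvtnorm)

set.seed(1)
p <- 2
mu <- c(0, 0)
Sigma <- matrix(c(1, .5, .5, 1), 2, 2)

X <- rmvnorm(1e4, mean = mu, sigma = Sigma)
d2 <- mahalanobis(X, center = mu, cov = Sigma)   # (x - mu)' Sigma^{-1} (x - mu) for each row

# fraction of points inside the 95% ellipse; should be close to .95
mean(d2 <= qchisq(.95, df = p))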

What does the multivariate normal density look like? Let's make some plots for the case $p = 2$ of a bivariate normal density with mean $0$ and covariance

$$\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$

for $\rho = 0, .25, .5, .95$.

[Figure 1: Level sets of the bivariate normal distribution with covariance matrix having diagonal elements 1 and off-diagonal elements as indicated in the subfigure titles.]

These plots show contours of equal probability, with the value of the density proportional to the color of the contour. In other words, we are visualizing the level sets of $f$ and the value of $f$ on each level set. As you can see, as the covariance/correlation increases, the major axis of the level sets rotates toward the line $y = x$ and the shape becomes elongated (that is, the eccentricity of the ellipse increases). This fits with our intuitive notion of correlation: when correlation is high between random variables $X$ and $Y$, they tend to take similar values. Negative correlation also induces the expected shape: the density concentrates around the major axis of the level set as the correlation grows more negative, and the orientation of the major axis rotates toward the line $y = -x$. That is, negatively correlated, jointly normal random variables tend to have values that are similar in magnitude, but opposite in sign.

[Figure 2: Level sets of the bivariate normal density for negative correlations.]

There is an interesting connection between the level sets of $f$ and the eigenvalues of $\Sigma$ (stated without proof, but see the book for examples).

Remark 2. Suppose $X \sim N(\mu, \Sigma)$, $\lambda_1, \ldots, \lambda_p$ are the eigenvalues of $\Sigma$ in descending order, and $e_1, \ldots, e_p$ are the associated eigenvectors. Then the $c^2$-level set $\mathcal{X}_{c^2}$ of the density $f$ of the random variable $X$ is an ellipsoid centered at $\mu$ with axes $\pm c \sqrt{\lambda_j}\, e_j$.

Think about this remark for a moment in the context of the previous two figures. When $\rho = 0$, the level sets of $f$ are actually circles, so the lengths of the two axes are the same. As $\rho \to 1$, the major axis elongates and rotates toward the line $y = x$, whereas as $\rho \to -1$, the rotation is toward the line $y = -x$. You may speculate that in the limit as $\rho = 1$ or $-1$, the density is zero except at points where $y = x$ (or $y = -x$). This intuition is essentially correct, and in fact the density concentrates on a line as $|\rho|$ increases. Now, let's interpret this in light of Remark 2 with an example.

Example 1. The matrix

$$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$

has linearly dependent columns, so it cannot have two nonzero eigenvalues. It's easy to figure out what the eigenvectors and eigenvalues are for this matrix:

$$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = 2 \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = 0 \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix},$$

so the eigenvectors are $(1/\sqrt{2}, 1/\sqrt{2})^\top$ with eigenvalue $2$ and $(-1/\sqrt{2}, 1/\sqrt{2})^\top$ with eigenvalue $0$. The matrix

$$\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$$

has the same eigenvectors, but $(1/\sqrt{2}, 1/\sqrt{2})^\top$ has eigenvalue $0$ and $(-1/\sqrt{2}, 1/\sqrt{2})^\top$ has eigenvalue $2$. In other words, the column spans of these matrices are one-dimensional linear spaces (lines), and these two lines are orthogonal. In summary, high correlation makes the density look linear in certain dimensions. We'll come back to this a bit later.
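Before moving on, here is a quick numerical check of Example 1 using base R's eigen() (eigenvectors are returned up to sign):

A <- matrix(c(1, 1, 1, 1), 2, 2)
eigen(A)   # eigenvalues 2 and 0; eigenvectors (1, 1)/sqrt(2) and (-1, 1)/sqrt(2)

B <- matrix(c(1, -1, -1, 1), 2, 2)
eigen(B)   # same eigenvectors, with the eigenvalues swapped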

Parameter estimation

Parameter estimation is often done by maximum likelihood, and the maximum likelihood estimators are available in closed form.

Remark 3. The maximum likelihood estimators of the mean $\mu$ and covariance $\Sigma$ of the multivariate normal distribution are

$$\hat{\mu} = \bar{x}, \qquad \hat{\Sigma} = S_n.$$

The book has a proof, though I think it is not the most straightforward one.
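As a quick illustration of Remark 3, here is a minimal sketch (assuming the mvtnorm package for the draws). Note that R's cov() uses the divisor $n - 1$, so it must be rescaled to obtain $S_n$:

library(mvtnorm)

set.seed(1)
X <- rmvnorm(500, mean = c(0, 2), sigma = matrix(c(1, .3, .3, 1), 2, 2))
n <- nrow(X)

mu.hat <- colMeans(X)               # xbar, the MLE of mu
Sigma.hat <- cov(X) * (n - 1) / n   # S_n, the MLE of Sigma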

Marginal and conditional distributions

The marginal and conditional distributions of a multivariate normal are also very nice. We will partition vectors and matrices in the following way. For a random $p$-vector $X$, put $X = (Y, Z)^\top$, with $Y$ a random $q$-vector and $Z$ a random $(p-q)$-vector. Write the mean of $X$ as $\mu = (\alpha, \beta)^\top$, with $\alpha$ a $q$-vector and $\beta$ a $(p-q)$-vector, and the covariance of $X$ as

$$\Sigma = \begin{pmatrix} \Psi & \Lambda \\ \Lambda^\top & \Phi \end{pmatrix}$$

for a $q \times q$ matrix $\Psi$, a $q \times (p-q)$ matrix $\Lambda$, and a $(p-q) \times (p-q)$ matrix $\Phi$. Also, when considering a partitioned vector or matrix, $0$ will be understood to mean that one block of the vector or matrix is all zero. For example,

$$\Sigma = \begin{pmatrix} \Psi & \Lambda \\ \Lambda^\top & 0 \end{pmatrix}$$

means that the lower-right $(p-q) \times (p-q)$ block of $\Sigma$ consists entirely of zeros. We have the following.

Theorem 4. Suppose $X = (Y, Z)^\top \sim N(\mu, \Sigma)$. Then $Y \sim N(\alpha, \Psi)$ and $Z \sim N(\beta, \Phi)$.

Proof. Corollary 2 applied to $A = (I_q, 0)$, with $A$ a $q \times p$ matrix.

In other words, if a random vector has a multivariate normal distribution, we can read off the marginal distribution of any subset of its coordinates by taking the corresponding entries of $\mu$ and $\Sigma$. Conditional distributions are also convenient. The following is given without proof (and with my apologies to anyone who worries about Borel's paradox); see the book for a proof.

Theorem 5. Suppose $X = (Y, Z)^\top \sim N(\mu, \Sigma)$. Then

$$Y \mid Z = z \sim N\left(\alpha + \Lambda \Phi^{-1}(z - \beta),\ \Psi - \Lambda \Phi^{-1} \Lambda^\top\right).$$

It's worth noting that the conditional mean is exactly what one obtains under a multivariate linear regression of $y$ on $z$. Finally, we have a well-known result on the correspondence between correlation and independence for jointly normal random variables.

Theorem 6. Suppose $X = (Y, Z)^\top \sim N(\mu, \Sigma)$. Then $Y \perp Z$ if and only if $\Lambda = 0$, the $q \times (p-q)$ matrix of zeros.

This is a rather unique property of the multivariate normal. It is not hard to generate examples of dependent random variables with zero correlation:

Example 2. Define a random $2$-vector as follows. First, choose $\theta \sim \mathrm{Uniform}(-\pi, \pi)$, then set

$$(X, Y)^\top = (\sin(\theta) + \epsilon_1,\ \cos(\theta) + \epsilon_2)^\top$$

for $\epsilon_1, \epsilon_2$ independent and marginally $N(0, \tau^2)$. Then $\mathrm{cor}(X, Y) = 0$, but $X$ and $Y$ are not independent. Let's see what this looks like:

library(ggplot2)
thet <- runif(1000, -pi, pi)
x <- sin(thet) + rnorm(1000, 0, .1)
y <- cos(thet) + rnorm(1000, 0, .1)
df <- data.frame(x=x, y=y)
ggplot(df, aes(x=x, y=y)) + geom_point(size=.3)

[Figure 3: Data sampled from the distribution in the previous example.]

The sample correlation in this example is close to zero, but clearly $X$ and $Y$ are not independent, since they are functions of the same random variable $\theta$. Here is another example, this time where $X$ and $Y$ are marginally normal, but not jointly normal.

Example 3. Let $X \sim N(0, 1)$ and $\xi$ a discrete random variable, independent of $X$, with probability mass function

$$P[\xi = 1] = P[\xi = -1] = 1/2.$$

Put $Y = \xi X$. Then $Y \sim N(0, 1)$ marginally and $\mathrm{cor}(X, Y) = 0$, but $X$ and $Y$ are not independent. Again, let's sample from this distribution and visualize:

x <- rnorm(1000)
u <- runif(1000)
xi <- -1*(u < .5) + 1*(u > .5)
y <- xi*x
df <- data.frame(x=x, y=y)
ggplot(df, aes(x=x, y=y)) + geom_point(size=.3)

[Figure 4: Data sampled from the distribution in the previous example.]

So having normal marginal distributions is not enough for joint normality. Marginals tell us nothing at all about dependence, and in fact it is quite easy to generate an arbitrary joint distribution with normal marginals. Suppose $X$ is a random vector with arbitrary joint distribution and marginal (continuous) CDFs $F_1, \ldots, F_p$, so that $P[X_j \le t] = F_j(t)$. Let $\Phi^{-1}$ be the standard normal quantile function (aka inverse CDF). Then

$$Y = (\Phi^{-1}(F_1(X_1)), \ldots, \Phi^{-1}(F_p(X_p)))^\top$$

has $Y_j \sim N(0, 1)$ for every $j$, but in general $Y$ is not jointly normal. (Some of you may recognize this as a simple consequence of Sklar's theorem regarding copulas. In fact, every copula can give rise to a joint distribution with normal marginals, but only a Gaussian copula can give rise to a multivariate normal distribution.) As such, the suggestion in the book that one assess normality by checking or testing for normality of the marginals seems misplaced. One can easily apply some transformation to the marginals to make them very nearly normal, but this does not mean that the resulting vector is even close to jointly normal. That said, transforming the marginals to be approximately normal is an important first step to assessing the plausibility of joint normality, for example by making bivariate scatter plots. These bivariate plots won't look remotely normal if the marginals are far from normal, but that doesn't indicate that there is no simple transformation to joint normality.
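To see the $\Phi^{-1}(F(\cdot))$ construction in action, here is a minimal sketch revisiting the circle data from Example 2; each marginal CDF is estimated with base R's ecdf(), and the small offset keeps qnorm() finite at the largest observation:

set.seed(1)
thet <- runif(1000, -pi, pi)
x <- sin(thet) + rnorm(1000, 0, .1)
y <- cos(thet) + rnorm(1000, 0, .1)
n <- 1000

# transform each marginal to be approximately N(0, 1)
z1 <- qnorm(ecdf(x)(x) - 1/(2*n))
z2 <- qnorm(ecdf(y)(y) - 1/(2*n))

plot(z1, z2)   # marginals look normal, but the joint is still a noisy circle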

The book suggests a few transformations that often make the marginals close to normal:

$z = \log(x/(1-x))$ for $x \in [0, 1]$;
$z = \sqrt{x}$ for $x \in \{0, 1, 2, \ldots\}$;
$z = \tfrac{1}{2} \log((1+x)/(1-x))$ for $x \in [-1, 1]$.

Another useful family are the power transformations, of which the Box-Cox transformations

$$z = \frac{x^\lambda - 1}{\lambda},$$

with $z = \log(x)$ for $\lambda = 0$, are best known. The parameter $\lambda$ is chosen to maximize the normal likelihood, with the sample mean and covariance of the transformed data substituted for the mean and covariance (see the book for details).

Finally, one can transform marginals using the empirical CDF. Recall that if $X \sim F$, then the random variable $F(X)$ has a uniform distribution on $[0, 1]$, and that if $G^{-1}$ is an inverse CDF (quantile function), then $G^{-1}(U) \sim G$ for $U \sim \mathrm{Uniform}(0, 1)$. This is generally referred to as the probability integral transform. Suppose we have observations on a univariate random variable $X$, and we know the CDF of $X$ is $F$. Then, for a sample $x_1, \ldots, x_n$ of data, we could transform each observation to get $u_1 = F(x_1), \ldots, u_n = F(x_n)$, which would be a sample from a uniform distribution. Then, we could make the second transformation $\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_n)$, where $\Phi^{-1}$ is the standard normal quantile function, to get a sample from a univariate normal distribution. In general, we don't know $F$, but we can estimate it by the empirical CDF

$$\hat{F}(t) = n^{-1} \sum_{i=1}^n 1_{\{x_i \le t\}},$$

the proportion of observed data less than or equal to any value $t$. R has a built-in function to compute $\hat{F}$. Then, we can transform to approximately standard normal data by

$$z = \Phi^{-1}(\hat{F}(x)).$$
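Here is a minimal sketch of that last transformation using base R's ecdf(); the $1/(2n)$ offset (the same one used on the real data later in these notes) keeps qnorm() away from infinity at the largest observation:

set.seed(1)
x <- rexp(200)    # skewed positive data
n <- length(x)

Fhat <- ecdf(x)
z <- qnorm(Fhat(x) - 1/(2*n))

qqnorm(z)   # close to a straight line: the transformed data are nearly standard normal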

Distributions of sample mean and covariance

When a sample is iid multivariate normal, the sample mean is also normal, and the sample covariance has a Wishart distribution.

Theorem 7. Suppose $X_1, \ldots, X_n$ (a collection of $n$ vectors of length $p$) are iid multivariate normal with mean $\mu$ and covariance $\Sigma$. Then

$$\bar{X} \sim N(\mu, n^{-1}\Sigma), \qquad (n-1)S \sim W_p(n-1, \Sigma),$$

and $\bar{X} \perp (n-1)S$.

The Wishart distribution is a distribution on matrix-valued random variables, and arises as the distribution of $\sum_i Z_i Z_i^\top$ for $Z_i \sim N(0, \Sigma)$. Its parameters are a degrees of freedom $\nu$ and a positive-definite scale matrix $\Sigma$, and we write $\Phi \sim W_p(\nu, \Sigma)$ to indicate that a random $p \times p$ matrix has the Wishart distribution with parameters $\nu$ and $\Sigma$ (note the difference from the book's notation). $W_p(\nu, \Sigma)$ has density

$$f(\Phi) = \frac{|\Phi|^{(\nu - p - 1)/2} \exp\left(-\mathrm{tr}(\Sigma^{-1}\Phi)/2\right)}{2^{\nu p/2}\, |\Sigma|^{\nu/2}\, \Gamma_p(\nu/2)}.$$

We can generate random matrices from a Wishart distribution in R using the package MCMCpack, like this:

library(MCMCpack)
p <- 2
n <- 5
Sigma <- matrix(c(1, .9, .9, 1), 2, 2)
Phi <- rwish(n-1, Sigma)
print(Phi)

[output: a random 2 x 2 positive-definite matrix]

We also have several strong asymptotic results, mentioned in a previous lecture, under the much weaker condition that the sampling process has finite mean and covariance.

Theorem 8. Suppose $X_1, \ldots, X_n$ is a random sample from a distribution with finite mean $\mu$ and covariance $\Sigma$. The following results hold as $n \to \infty$ with $p$ fixed:

$$\bar{X}_n = n^{-1} \sum_{i=1}^n X_i \overset{p}{\to} \mu$$
$$S,\ S_n \overset{p}{\to} \Sigma$$
$$\sqrt{n}(\bar{X}_n - \mu) \overset{D}{\to} N(0, \Sigma) \quad \text{(the multivariate CLT)}$$
$$n(\bar{X}_n - \mu)^\top S_n^{-1} (\bar{X}_n - \mu) \overset{D}{\to} \chi^2_p.$$

The book provides some justification of the first two. The proof of the multivariate central limit theorem is not trivial, but among the simpler proofs is one using characteristic functions, and variations of it can be found in most asymptotics textbooks. I would add to the book's (informal) justifications for these results that the last line apparently follows from Slutsky's theorem once we have the CLT.
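To connect the two characterizations of the Wishart above, here is a minimal sketch (again assuming MCMCpack is installed) that builds a draw directly as a sum of outer products and compares it with rwish:

library(MCMCpack)

set.seed(1)
p <- 2; nu <- 50
Sigma <- matrix(c(1, .9, .9, 1), 2, 2)

ee <- eigen(Sigma)
Sig.half <- ee$vectors %*% diag(sqrt(ee$values)) %*% t(ee$vectors)

Z <- matrix(rnorm(nu*p), nu, p) %*% Sig.half   # rows are iid N(0, Sigma)
Phi1 <- t(Z) %*% Z        # sum_i Z_i Z_i', a W_p(nu, Sigma) draw
Phi2 <- rwish(nu, Sigma)  # a draw from the same distribution via MCMCpack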

Assessing normality

The book makes a number of suggestions of how to assess joint normality. As we've already discussed, I won't focus much on the marginals. If one plans to model data as jointly normal, of course it will be necessary to transform the marginals to make them nearly normal. This is all very standard, and procedures for doing so (e.g. Box-Cox transformations, use of the empirical CDF) are covered in univariate statistics. The harder issue is whether the data are plausibly jointly normal, even after transforming the marginals.

Let's get back to some real data. Here are some bivariate plots of the first four stocks from the djia data. Before plotting, I use the empirical CDF to transform the marginals to be approximately normal, so when looking at the scatter plots, any non-normal appearance is entirely due to the nature of dependence, not non-normality of the marginals.

library(GGally)
load('../../datasets-other/djia/djia.rdata')
df <- djia.ldr[,3:6]
n <- nrow(df)
for (j in 1:4) {
  Fj <- ecdf(df[,j])
  df[,j] <- qnorm(Fj(df[,j]) - 1/(2*n))
}
ggpairs(df, lower=list(continuous=wrap("points", size=.1)))

[Figure 5: Bivariate scatter plots and marginal density plots for four stocks (axp, ba, cat, csco) in the djia data.]

Does that look bivariate Gaussian to you? Probably not, in my view. But this is just exploratory. There are a number of tests for multivariate normality that the book does not discuss. Some of these tests are implemented in the R package MVN. The test statistics and associated distribution theory are messy, so we won't go into that here (see https://cran.r-project.org/web/packages/MVN/vignettes/MVN.pdf if interested). A brief intuition is that most such test statistics are some function of the squared Mahalanobis distance

$$(x_i - \bar{x}_n)^\top (S_n)^{-1} (x_i - \bar{x}_n),$$

which we already know is asymptotically $\chi^2$ distributed. Let's try one of these tests, Mardia's test, on the components of the djia data that we plotted above. Mardia's test actually consists of two test statistics, a skewness and a kurtosis statistic. These are based on multivariate generalizations of higher moments, and the intuition is that since the multivariate normal is completely determined by its first two moments (mean and covariance), higher-order moments should be a function of $\mu, \Sigma$. Measuring the discrepancy between observed and expected values of the higher-order moments, based on estimates of $\mu$ and $\Sigma$, gives us a test statistic that is approximately $\chi^2$ distributed under the null. We can perform the test with the MVN package.

library(MVN)
mardiaTest(df, qqplot=F)

[output: Mardia's Multivariate Normality Test on df; the skewness statistics have p-values on the order of 1e-6. Result: Data are not multivariate normal]

The package nicely points out the obvious conclusion based on the output: we reject the null at the .05 level (or the .01 level, or any level people tend to choose, really). These data appear not to be jointly normal. A few important notes. (1) These tests are based on the asymptotic distribution of the test statistic, so p-values will be approximate in small samples. Fortunately, the sample size here is relatively large compared to $(p^2 + 3p)/2$ (the number of free parameters in the covariance and mean put together). (2) The djia data are not iid, since they exhibit serial dependence. Thus, the p-values in the output are too small. The way to think about this is that because there is dependence between sequential observations, each observation carries less information than an observation from an iid sample. Nonetheless, the conclusion of the test is so clear-cut that I suspect it is robust to the failure to account for autocorrelation.

How about the lizard data? Let's remake that plot from the first lecture, this time transforming the marginals.

lizard <- read.csv('../../datasets/t1-3.dat', sep=' ', header=F)
lizard <- lizard[,c(3,5,7)]
names(lizard) <- c('mass','svl','hls')
n <- nrow(lizard); p <- ncol(lizard)
for (j in 1:p) {
  Fj <- ecdf(lizard[,j])
  lizard[,j] <- qnorm(Fj(lizard[,j]) - 1/(2*n))
}
ggpairs(lizard, lower=list(continuous=wrap("points", size=.1)))

[Figure 6: Bivariate plots of the lizard data (mass, svl, hls).]

These have high pairwise correlation, but they look more plausibly normal. Let's try the test.

mardiaTest(lizard, qqplot=F)

[output: Mardia's Multivariate Normality Test on the lizard data; neither the skewness nor the kurtosis p-value is small. Result: Data are multivariate normal]

It looks like these data probably are multivariate normal; that is, we fail to reject the null hypothesis that they are multivariate normal. Again, a word of caution: there are only 25 observations here, so the p-values are somewhat approximate. Does this make scientific sense? I think so. A gross generalization is that things like body dimensions (what we are measuring for the lizards) often are nearly normally distributed in the population; indeed, one of the earliest studies of bivariate association used data on height and weight. On the other hand, things like stock prices tend not to be. The nature of dependence in financial markets simply isn't well-approximated by a multivariate normal distribution. One possible explanation for this is that the multivariate normal distribution has no tail dependence: no matter how high the correlation, the largest observations for each coordinate tend to be nearly independent in large samples. On the other hand, we expect the largest (or smallest) observations from financial time series to be the most dependent, so intuitively the multivariate normal seems like a bad fit.

It's important to keep in mind that even when the multivariate normal is not a good fit for the data, it does not mean that the sample mean and sample covariance are not good summaries of the data. Indeed, many of the methods we'll use in this course don't assume multivariate normality. Multivariate normality (or asymptotic multivariate normality) is somewhat more important in testing, for example, than it is when using methods like PCA for data reduction and summary.
