Stat 206: Sampling theory, sample moments, Mahalanobis topology

James Johndrow (adapted from Iain Johnstone's notes)

Notation

My notation is different from the book's. This is partly because I am going to be writing on the board, and having less complicated notation makes that easier. I am also just a minimalist when it comes to notation. My notation is:

symbol   description         what it is
X        upper case letter   matrix, random variable
x        lower case letter   vector

Compare this to the book's notation:

symbol   description                     what it is
X        upper case bold letter          matrix (except data)
X        upper case bold letter          vector random variable
x        lower case bold letter          vector
x        lower case non-bold letter      scalar
X        bigger upper case bold letter   data matrix

There are clearly pros and cons. Here are possible points where confusion may arise using my notation:

1. Random variables. I'll use $X$ to refer both to random variables and to the data matrix. It will usually be obvious from the context which I am talking about. In particular, if I write $X \sim f(x)$, $E[X]$, et cetera (i.e., anytime I make a probability statement), I am referring to the random variable.

2. Subscripting. We will sometimes talk about a collection of random vectors $X_1, \dots, X_n$ where each $X_i$ is a vector. We will also talk about the data matrix entries $X_{ij}$, which might look like the $j$th entry of the $i$th random vector. Again, it will hopefully be clear from context which I mean, and if not, I'll make an effort to point it out. As a rule of thumb, something with a single subscript $i$ will usually be the $i$th vector in a collection, and something with a single subscript $j$ will (usually) refer to the $j$th component of a vector (also see the next point).

3. Indexing. The book uses $j$ to index observations and $k$ to index variables. I will use $i$ to index observations and $j$ to index variables. The notation I use is more common in statistics, so if I tried to switch to the book's notation I would inevitably fall back into my usual habit, causing even more confusion.

I apologize in advance for having to keep track of different notation.

Random sampling

By and large we will assume that our data $x_1, \dots, x_n$, where each $x_i = (x_{i1}, \dots, x_{ip})'$,[1] are independent realizations[2] of a vector random variable $X$ with a density $f : \mathbb{R}^p \to \mathbb{R}$, that is, $X \sim f$. When we write $X \sim f$, we mean the data distribution has a density satisfying $\int_{\mathbb{R}^p} f(x)\,dx = 1$.

[1] A prime ($'$) will refer to the transpose of a vector or matrix, i.e. the object with row and column indices switched. By default, vectors are column vectors, so their transposes are row vectors.

[2] A common situation in which independence is violated is in time series applications or longitudinal studies, but the principles we learn by studying independence can be applied to develop methods for non-independent samples.

We commonly need to partition $X$ as

$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$$

where $X_1, X_2$ are random vectors of dimension $p_1, p_2$ with $p_1 + p_2 = p$. Then the marginal density of $X_1$ is

$$f_1(x_1) = \int_{\mathbb{R}^{p_2}} f(x_1, x_2)\,dx_2$$

and the conditional density of $x_2$ given $x_1 = x_1^0$ is

$$f(x_2 \mid x_1) = \frac{f(x_1^0, x_2)}{f_1(x_1^0)}.$$

Statistical independence occurs when $f(x_2 \mid x_1^0) = f(x_2)$ for all $x_1^0 \in \mathbb{R}^{p_1}$. When $X_1$ is independent of $X_2$ we write $X_1 \perp X_2$.

Theorem 1. If $X_1 \perp X_2$ then $f(x) = f_1(x_1) f_2(x_2)$.

Bayes' theorem allows us to reverse conditional probabilities. Suppose we have random variables $(\Theta, X)$, where $\Theta \sim f(\theta)$ is the prior and $X \mid \Theta \sim f(x \mid \theta)$ is the likelihood or sampling model. Then the joint density of $(\Theta, X)$ is $f(\theta) f(x \mid \theta)$ and the marginal density of $X$ is

$$f(x) = \int f(\theta) f(x \mid \theta)\,d\theta.$$
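Before stating Bayes' theorem formally (Theorem 2 below), here is a small numerical sketch of this marginal density calculation in one dimension. The Beta(2, 2) prior and binomial likelihood are hypothetical choices, not from the notes; the grid approximation simply replaces the integral with a Riemann sum.

theta <- seq(0.001, 0.999, length.out = 1000)  # grid over the parameter space
h <- theta[2] - theta[1]                       # grid spacing
prior <- dbeta(theta, 2, 2)                    # f(theta), a hypothetical prior
lik <- dbinom(7, size = 10, prob = theta)      # f(x | theta) at x = 7 successes
fx <- sum(prior * lik) * h                     # marginal f(x) by quadrature
post <- prior * lik / fx                       # f(theta | x); see Theorem 2 below
max(abs(post - dbeta(theta, 2 + 7, 2 + 3)))    # agrees with the conjugate Beta(9, 5) answer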

Theorem 2 (Bayes). The posterior density of $\Theta \mid X$ is the conditional distribution of the parameters given the observables, and is given by

$$f(\theta \mid x) = \frac{f(x \mid \theta) f(\theta)}{f(x)}.$$

Note: each of $\Theta$ and $X$ could be a multivariate vector or a discrete quantity (though in the latter case, we would replace densities with pmfs).

The mean $\mu$ and variance of the vector variable $X$ (when they exist)[3] are defined analogously to the univariate case.

[3] In general we assume both the mean and variance exist and are finite.

1. The population mean vector $\mu = EX$ has components

$$\mu_j = \int x_j f(x)\,dx.$$

2. The population covariance matrix is

$$\Sigma = \operatorname{cov}(X) = E[(X - \mu)(X - \mu)'].$$

The matrix $\Sigma$ is $p \times p$ and has entries

$$\sigma_{jk} = \operatorname{cov}(X_j, X_k) = E[(X_j - \mu_j)(X_k - \mu_k)] = \int (x_j - \mu_j)(x_k - \mu_k) f(x)\,dx.$$

It follows that $\Sigma$ is symmetric, i.e. $\Sigma = \Sigma'$, and non-negative definite (defined formally in the next lecture). If we only wish to specify the first and second order moments of a random vector $X$, it is convenient to write $X \sim (\mu, \Sigma)$, keeping in mind that this does not specify a particular distribution for $X$.

Some key properties of means and covariances that we use frequently are the following.

Remark 1. $\Sigma = E[XX'] - \mu\mu'$.

Proof. Expanding,

$$(X - \mu)(X - \mu)' = XX' - \mu X' - X\mu' + \mu\mu'.$$

Taking the expectation,

$$\Sigma = E[XX'] - \mu E[X'] - E[X]\mu' + \mu\mu' = E[XX'] - \mu\mu'.$$
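As a quick sanity check on Remark 1, here is a short Monte Carlo sketch in R. The bivariate normal population, and the use of MASS::mvrnorm to draw from it, are hypothetical choices for illustration.

set.seed(1)
mu <- c(1, -2)
Sigma <- matrix(c(2, 0.5, 0.5, 1), 2, 2)
X <- MASS::mvrnorm(1e5, mu, Sigma)  # rows are draws of X'
EXX <- crossprod(X) / 1e5           # approximates E[XX']
EXX - tcrossprod(mu)                # E[XX'] - mu mu' approximately recovers Sigma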

Another property is linearity.

Theorem 3 (linearity of expectation, vector case).

$$E[AX + b] = AE[X] + b = A\mu + b, \qquad \operatorname{cov}(AX) = A \operatorname{cov}(X) A' = A\Sigma A'.$$

Proof. For the mean,

$$E[(AX)_j] = E\Big[\sum_k a_{jk} X_k\Big] = \sum_k a_{jk} E[X_k] \tag{1}$$

$$= (AEX)_j = (A\mu)_j. \tag{2}$$

Now

$$E[(AX - A\mu)(AX - A\mu)'] = E[A(X - \mu)(X - \mu)'A'] = A\,[E(X - \mu)(X - \mu)']\,A' = A\Sigma A',$$

where the next-to-last step, if written out fully, would involve repeated use of linearity of expectation as in (1).

Linear combinations are just a special case. If $a \in \mathbb{R}^p$ is a constant vector then $a'X$ has moments

$$E[a'X] = a'\mu, \qquad \operatorname{var}(a'X) = a'\Sigma a = \sum_{j,k} a_j \sigma_{jk} a_k.$$
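A simulation sketch of Theorem 3; the matrix A, the offset b, and the population below are all hypothetical choices.

set.seed(2)
A <- matrix(c(1, 0, 2, 1, -1, 3), nrow = 2)  # a 2 x 3 matrix
b <- c(1, 1)
mu <- c(0, 1, 2)
Sigma <- diag(c(1, 4, 9))
X <- MASS::mvrnorm(1e5, mu, Sigma)  # rows are draws of X'
Y <- t(A %*% t(X) + b)              # rows are draws of (AX + b)'
colMeans(Y); A %*% mu + b           # E[AX + b] = A mu + b
cov(Y); A %*% Sigma %*% t(A)        # cov(AX + b) = A Sigma A'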

Partitioning vectors and matrices

We often want to partition vectors and matrices in similar fashion to what we did for random variables. If we partition $X$ as before, e.g.

$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix},$$

then the mean of $X$ and the covariance matrix are partitioned conformably:

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$

Writing this out in a little more detail, we have

$$\mu = EX = \begin{pmatrix} EX_1 \\ EX_2 \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$$

and

$$\Sigma = E(X - \mu)(X - \mu)' = E\left[\begin{pmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \end{pmatrix} \begin{pmatrix} (X_1 - \mu_1)' & (X_2 - \mu_2)' \end{pmatrix}\right] = \begin{pmatrix} E(X_1 - \mu_1)(X_1 - \mu_1)' & E(X_1 - \mu_1)(X_2 - \mu_2)' \\ E(X_2 - \mu_2)(X_1 - \mu_1)' & E(X_2 - \mu_2)(X_2 - \mu_2)' \end{pmatrix} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$

Notice that, by symmetry, $\Sigma_{12} = \Sigma_{21}'$.

It is sometimes useful to consider instead the correlation between components of $X$, which has the same interpretation regardless of the marginal variances of $X$:

$$\rho_{jk} = \operatorname{cor}(X_j, X_k) = \frac{\operatorname{cov}(X_j, X_k)}{\sqrt{\operatorname{var}(X_j)}\,\sqrt{\operatorname{var}(X_k)}} \in [-1, 1],$$

and the correlation matrix, often denoted $P$: the $p \times p$ matrix with entries $P_{jk} = \rho_{jk}$. If $V = \operatorname{diag}(\sigma_{11}, \sigma_{22}, \dots, \sigma_{pp})$,[4] where the $\sigma_{jj}$ are the diagonal entries of $\Sigma$ (the marginal variances), then we can express $P$ as

$$P = V^{-1/2} \Sigma V^{-1/2}, \qquad \text{where } V^{-1/2} = \operatorname{diag}(\sigma_{11}^{-1/2}, \dots, \sigma_{pp}^{-1/2}).$$

[4] This notation means the diagonal entries are given by the values inside the parentheses and all the off-diagonal entries are zero.
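In R the identity $P = V^{-1/2} \Sigma V^{-1/2}$ is one line, and base R's cov2cor performs the same scaling; the $3 \times 3$ covariance matrix below is a hypothetical example.

Sigma <- matrix(c(4, 1, 0,
                  1, 1, 0.3,
                  0, 0.3, 9), 3, 3)
Vinvhalf <- diag(1 / sqrt(diag(Sigma)))  # V^{-1/2}
Vinvhalf %*% Sigma %*% Vinvhalf          # the correlation matrix P
cov2cor(Sigma)                           # base R computes the same thing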

Sample moments

We can now give some basic properties of the sample mean and sample covariance. The sample mean $\bar{x}$ is given by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i = \Big( \frac{1}{n}\sum_{i=1}^n x_{i1}, \dots, \frac{1}{n}\sum_{i=1}^n x_{ip} \Big)' = (\bar{x}_1, \dots, \bar{x}_p)'.$$

Since the $x_i$ are iid realizations of a random variable $X \sim f$,

$$E[\bar{X}] = E\Big[\frac{1}{n}\sum_{i=1}^n X_i\Big] = \frac{1}{n}\sum_{i=1}^n E[X_i] = \mu = (\mu_1, \dots, \mu_p)',$$

so the expectation of the sample mean is the mean $\mu$ of the random vector $X$ with density $f$.

The sample covariance matrix $S_n$ is defined as

$$(S_n)_{jk} = \frac{1}{n}\sum_{i=1}^n (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k),$$

so we can express the sample covariance matrix as a sum of matrices:

$$S_n = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})'.$$

Now we'll state an important result about the sample moments and prove part of it (see the book for the rest of the proof).

Theorem 4. The covariance of the sample mean is

$$\operatorname{cov}(\bar{X}) = \frac{1}{n}\Sigma,$$

and the expectation of the sample covariance is

$$E[S_n] = \frac{n-1}{n}\Sigma,$$

so $n(n-1)^{-1} S_n = S$ is an unbiased estimator of $\Sigma$.

Proof. We prove the second part; see the book for the first part. First,

$$\sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})' = \sum_{i=1}^n (X_i - \bar{X}) X_i' - \sum_{i=1}^n (X_i - \bar{X}) \bar{X}' = \sum_{i=1}^n X_i X_i' - n \bar{X}\bar{X}',$$

since $\sum_{i=1}^n (X_i - \bar{X}) = 0$ and $\bar{X}' = n^{-1} \sum_{i=1}^n X_i'$. So then

$$E\Big[\sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})'\Big] = \sum_{i=1}^n E[X_i X_i'] - n E[\bar{X}\bar{X}'].$$

Now applying Remark 1 (to each $X_i$ and to $\bar{X}$, using $\operatorname{cov}(\bar{X}) = n^{-1}\Sigma$ from the first part), we have

$$E[S_n] = n^{-1}\sum_{i=1}^n E[X_i X_i'] - E[\bar{X}\bar{X}'] = \Sigma + \mu\mu' - (n^{-1}\Sigma + \mu\mu') = \frac{n-1}{n}\Sigma.$$
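A small simulation illustrating Theorem 4. Note that R's cov() already uses the $n - 1$ divisor, i.e. it returns $S$ rather than $S_n$; the population and sample size below are hypothetical choices.

set.seed(3)
n <- 10
mu <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
Sn <- replicate(5000, ((n - 1) / n) * cov(MASS::mvrnorm(n, mu, Sigma)))
apply(Sn, c(1, 2), mean)  # close to ((n - 1)/n) Sigma, not Sigma: S_n is biased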

We can also define the sample correlation matrix $R$ by

$$R_{jk} = \frac{s_{jk}}{\sqrt{s_{jj}}\,\sqrt{s_{kk}}}, \qquad s_{jk} = (S)_{jk}.$$

If we put $D^{1/2} = \operatorname{diag}(s_{11}^{1/2}, \dots, s_{pp}^{1/2})$ then

$$R = D^{-1/2} S D^{-1/2}.$$

Finally, note that the law of large numbers and the central limit theorem also work for vectors. We will give these results without proof; if interested, there are many references. For our purposes, it is important just to know that these key asymptotic results also hold for vectors.

Theorem 5 (Multivariate weak law of large numbers). Let $X_1, \dots, X_n$ be a sequence of iid length-$p$ random vectors with finite mean $\mu$, and let $\bar{X}_n = n^{-1}\sum_{i=1}^n X_i$. Then

$$P\big[\, |\bar{X}_n - \mu| \ge \epsilon \,\big] \to 0 \quad \text{as } n \to \infty$$

for all $\epsilon > 0$, where $|x| = \sum_{j=1}^p |x_j|$ is the $L_1$ norm.

Theorem 6 (Multivariate central limit theorem). Let $X_1, \dots, X_n$ be a sequence of iid length-$p$ random vectors with finite mean $\mu$ and finite covariance $\Sigma$. Then

$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{D} \mathrm{No}(0, \Sigma).$$

In Theorem 6, $\mathrm{No}(0, \Sigma)$ is the multivariate normal distribution with mean $0$ and covariance $\Sigma$, which we will soon characterize.

Finally, we will briefly mention the notion of generalized variance, which the book (and numerous other sources) defines as $|S|$, the determinant of $S$. This is a sensible way to summarize the variability of the sample in a single number, and we will revisit it after we have reviewed a bit more linear algebra, which makes its properties easier to understand.
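A sketch of Theorem 6 by simulation, using a decidedly non-normal population with independent Exp(1) coordinates (a hypothetical choice, so here $\mu = (1, 1)'$ and $\Sigma = I$):

set.seed(4)
n <- 200
Z <- replicate(2000, sqrt(n) * (colMeans(cbind(rexp(n), rexp(n))) - c(1, 1)))
rowMeans(Z)  # close to (0, 0)
cov(t(Z))    # close to the identity, as Theorem 6 predicts
# a qqnorm() of Z[1, ] looks close to standard normal despite the skewed population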

Vector norms and Mahalanobis topology

When dealing with univariate random variables, the notion of magnitude is relatively straightforward. We generally take the magnitude of a real number to be its absolute value, which is of course equal to the Euclidean or $L_2$ norm $\sqrt{x^2}$ when $x$ is unidimensional. However, for vectors the definition of magnitude is more subtle. One notion we have already mentioned is the $L_1$ norm, $|x| = \sum_j |x_j|$, the sum of the absolute values of the entries. But this isn't equivalent to the $L_2$ norm for vectors:

$$\|x\|_2 = \langle x, x \rangle^{1/2} = \sqrt{\sum_j x_j^2} \le \sum_j \sqrt{x_j^2} = \sum_j |x_j|$$

by the triangle inequality, where $\langle x, y \rangle = x'y$ is the inner (or dot) product. The $L_2$ norm is arguably the default way to measure the magnitude of vectors, and it induces a metric on $\mathbb{R}^p$ via

$$d(x, y) = \|x - y\|_2 = \sqrt{\sum_j (x_j - y_j)^2}. \tag{3}$$

This is referred to as the Euclidean metric, which corresponds to the familiar straight-line distance in $\mathbb{R}^p$.

When considering distance between data points, it may not make sense to use the Euclidean metric. To understand why, it helps to know a little about quadratic forms.

Definition 1. For a $p$-vector $x$ and a $p \times p$ symmetric matrix $\Lambda$, a quadratic form is given by the matrix product $x'\Lambda x$.

The expectation of quadratic forms is simple.

Theorem 7. Let $X$ be a random vector with finite mean $\mu$ and finite covariance $\Sigma$. Then

$$E[X'\Lambda X] = \operatorname{tr}(\Lambda\Sigma) + \mu'\Lambda\mu.$$

Proof. Note: we use properties of the trace of a matrix that will be discussed in the next lecture. Since $X'\Lambda X$ is a scalar, it equals its own trace, so

$$E[X'\Lambda X] = E[\operatorname{tr}(X'\Lambda X)] = E[\operatorname{tr}(\Lambda XX')] = \operatorname{tr}(E[\Lambda XX']) = \operatorname{tr}(\Lambda E[XX']) = \operatorname{tr}(\Lambda(\Sigma + \mu\mu')) = \operatorname{tr}(\Lambda\Sigma) + \operatorname{tr}(\Lambda\mu\mu') = \operatorname{tr}(\Lambda\Sigma) + \mu'\Lambda\mu,$$

where the last step uses the cyclic property $\operatorname{tr}(\Lambda\mu\mu') = \operatorname{tr}(\mu'\Lambda\mu) = \mu'\Lambda\mu$, again a scalar.

In Theorem 7, $\operatorname{tr}$ is the trace of a matrix, which is the sum of its diagonal elements.
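A Monte Carlo sketch of Theorem 7; the matrix $\Lambda$ and the population below are hypothetical choices.

set.seed(5)
mu <- c(1, 2)
Sigma <- matrix(c(2, 0.3, 0.3, 1), 2, 2)
Lambda <- matrix(c(1, 0.5, 0.5, 2), 2, 2)  # symmetric
X <- MASS::mvrnorm(1e5, mu, Sigma)
mean(rowSums((X %*% Lambda) * X))          # averages the quadratic forms x_i' Lambda x_i
sum(diag(Lambda %*% Sigma)) + drop(t(mu) %*% Lambda %*% mu)  # tr(Lambda Sigma) + mu' Lambda mu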

Using this result, we can understand the average distance between points in an iid sample.

Theorem 8 (Expected Euclidean distance). Suppose $X, Y$ are independent, identically distributed random vectors with mean $\mu$ and covariance $\Sigma$. Then

$$E[\|X - Y\|_2^2] = 2\sum_j \sigma_{jj}.$$

Proof. The difference $X - Y$ has mean $E[X - Y] = 0$ and, by independence, covariance $\operatorname{cov}(X - Y) = 2\Sigma$. Applying Theorem 7 with $\Lambda = I$,

$$E[\|X - Y\|_2^2] = E[(X - Y)' I (X - Y)] = \operatorname{tr}(I \cdot 2\Sigma) + (E[X - Y])' I (E[X - Y]) = 2\operatorname{tr}(\Sigma) = 2\sum_j \sigma_{jj}.$$

Thus, the Euclidean distance between sample points will depend on the variances. This is an undesirable property, since we'd like to be able to interpret distances between sample points in the same way for all samples; we'd like to have a common scale that means something similar no matter how our data were generated. So it makes sense to have a distance metric that scales by the inverse of the variances. We'll actually do a bit more than that, and focus on Mahalanobis distances.[5]

[5] Don't worry about the terms symmetric and positive definite for now; we'll define them soon.

Definition 2 (Mahalanobis distance). Given a $p \times p$ symmetric, positive-definite matrix $\Lambda$, the Mahalanobis distance $\|x - y\|_\Lambda$ between $p$-vectors $x$ and $y$ with respect to $\Lambda$ is given by

$$\|x - y\|_\Lambda = \sqrt{(x - y)'\Lambda^{-1}(x - y)} = d_\Lambda(x, y).$$

If we put $\Lambda = \Sigma$ in the definition above and measure distances using $\|x - y\|_\Sigma$, then for $X, Y$ as in Theorem 8 we have

$$E[\|X - Y\|_\Sigma^2] = E[(X - Y)'\Sigma^{-1}(X - Y)] = \operatorname{tr}(\Sigma^{-1} \cdot 2\Sigma) = 2p,$$

no matter the value of $\Sigma$.[6] Additional motivation for using $\|x - y\|_\Sigma$ will be offered when we study the multivariate normal distribution.

[6] The notation $\Sigma^{-1}$ refers to the inverse of the matrix $\Sigma$, that is, the matrix that when multiplied by $\Sigma$ gives the identity.

How do distances in the Mahalanobis metric compare to the straight-line distances that we are used to? The best way to answer this is geometric. In the usual Euclidean metric, the set of all points $x$ equidistant from a single point $y$ are those that lie on the perimeters of circles centered at $y$, and the set of all points at distance $m$ from $y$ is given by the equation of the circle $(x - y)'(x - y) = m^2$. In the Mahalanobis metric, the set of all points equidistant from $y$ is given by an ellipse with axis lengths proportional to the inverse variances and orientation determined by the off-diagonal entries of $\Sigma^{-1}$.
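Base R's mahalanobis(x, center, cov) returns the squared distance $(x - \text{center})'\Sigma^{-1}(x - \text{center})$, so taking a square root gives $d_\Sigma$. A small sketch with a hypothetical $\Sigma$ (the same one used in Figure 1 below) shows how the metric distorts Euclidean geometry:

Sigma <- matrix(c(1, 0.9, 0.9, 1), 2, 2)
x <- rbind(c(1, 1), c(1, -1))  # two points with the same Euclidean norm
sqrt(rowSums(x^2))             # Euclidean distances from the origin: both sqrt(2)
sqrt(mahalanobis(x, center = c(0, 0), cov = Sigma))
# (1, -1) cuts against the positive correlation, so it is much farther in d_Sigma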

Another way to think about this is that the action of the matrix $\Sigma^{1/2}$ on a vector $x$ via the matrix product $\Sigma^{1/2} x$ is to rotate and stretch the original vector.[7] You can think equivalently in terms of rotating and scaling the coordinate axes in $p$-dimensional space. Figure 1 shows the set of points that are distance 1 from the origin in the Euclidean metric (a circle) and the set of points equidistant from the origin in the Mahalanobis metric $d_\Lambda$ for

$$\Lambda = \begin{pmatrix} 1 & .9 \\ .9 & 1 \end{pmatrix}.$$

[7] Don't know what $\Sigma^{1/2}$ is? Don't worry, we'll get to that shortly.

The figure was produced with the following R code (which uses the ellipse and ggplot2 packages):

library(ellipse)
library(ggplot2)
pts <- ellipse(matrix(c(1, .9, .9, 1), 2, 2), centre = c(0, 0))
df <- data.frame(x = pts[, 1], y = pts[, 2])
pts2 <- ellipse(matrix(c(1, 0, 0, 1), 2, 2), centre = c(0, 0))
df2 <- data.frame(x = pts2[, 1], y = pts2[, 2])
df$cor <- .9
df2$cor <- 0
df <- rbind(df, df2)
df$cor <- as.factor(df$cor)
ggplot(df, aes(x = x, y = y, col = cor)) + geom_path()

Figure 1: the set of points equidistant from the origin in the Euclidean metric (cor = 0) and the Mahalanobis metric defined in the text (cor = .9).

Summary

We have covered a number of properties of random vectors and multivariate samples. Importantly, what we have done so far required only (1) iid observations of a random variable $X$ with a density $f$, and (2) that $X$ has finite mean and covariance. We made no other assumptions about $X$ or $f$. We will soon shift focus to the study of the multivariate normal distribution. Because $\mu$ and $\Sigma$ play an important role in understanding the multivariate normal, it is easy to lose sight of the fact that the sample mean and sample covariance have meaning and certain statistical properties regardless of whether $f$ is the density of a multivariate normal. Keep this in mind as we move along.
