Partitioned Covariance Matrices and Partial Correlations
Proposition 1 Let the $(p+q) \times (p+q)$ covariance matrix $C > 0$ be partitioned as
$$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}.$$
Then the symmetric matrix $C^{-1} > 0$ has the following partitioned form:
$$C^{-1} = \begin{pmatrix} C_{1|2}^{-1} & -C_{1|2}^{-1} C_{12} C_{22}^{-1} \\ -C_{22}^{-1} C_{21} C_{1|2}^{-1} & C_{22}^{-1} + C_{22}^{-1} C_{21} C_{1|2}^{-1} C_{12} C_{22}^{-1} \end{pmatrix},$$
where
$$C_{1|2} = C_{11} - C_{12} C_{22}^{-1} C_{21}.$$
Further, $C > 0$ is equivalent to the fact that both $C_{22}$ and $C_{1|2}$ are regular (invertible) matrices.

Proof Introduce the auxiliary matrix
$$D = \begin{pmatrix} I_p & -C_{12} C_{22}^{-1} \\ O & I_q \end{pmatrix}.$$
Note that $|D| = 1$, so $D$ is regular. With it,
$$D C D^T = \begin{pmatrix} C_{1|2} & O \\ O & C_{22} \end{pmatrix},$$
from where $|C| = |C_{1|2}| \cdot |C_{22}|$ and
$$C^{-1} = D^T \begin{pmatrix} C_{1|2}^{-1} & O \\ O & C_{22}^{-1} \end{pmatrix} D,$$
which implies the required result.
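The block formula lends itself to a quick numerical sanity check. The following NumPy sketch is not part of the original notes; the block sizes and the randomly generated matrix are arbitrary choices. It assembles $C^{-1}$ from the blocks of Proposition 1 and compares the result with a direct inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 3

# Build a random positive definite covariance matrix C of size (p+q) x (p+q).
A = rng.standard_normal((p + q, p + q))
C = A @ A.T + (p + q) * np.eye(p + q)

C11, C12 = C[:p, :p], C[:p, p:]
C21, C22 = C[p:, :p], C[p:, p:]

# Schur complement C_{1|2} = C11 - C12 C22^{-1} C21.
C1_2 = C11 - C12 @ np.linalg.solve(C22, C21)

# Assemble C^{-1} from the blocks of Proposition 1.
Si = np.linalg.inv(C1_2)      # C_{1|2}^{-1}
C22i = np.linalg.inv(C22)
K = np.block([
    [Si,                -Si @ C12 @ C22i],
    [-C22i @ C21 @ Si,  C22i + C22i @ C21 @ Si @ C12 @ C22i],
])

assert np.allclose(K, np.linalg.inv(C))           # block inverse formula
assert np.isclose(np.linalg.det(C),
                  np.linalg.det(C1_2) * np.linalg.det(C22))  # |C| = |C_{1|2}| |C_{22}|
print("Proposition 1 verified numerically.")
```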
Theorem 1 Let $(X_1^T, X_2^T)^T \sim N_{p+q}(\mu, C)$ be a random vector, where the expectation $\mu$ and the covariance matrix $C$ are partitioned (with block sizes $p$ and $q$) in the following way:
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}.$$
Here $C_{11}$, $C_{22}$ are the covariance matrices of $X_1$ and $X_2$, whereas $C_{12} = C_{21}^T$ is the cross-covariance matrix. Then the conditional distribution of the random vector $X_1$ conditioned on $X_2 = x_2$ is the $N_p(C_{12} C_{22}^{-1} (x_2 - \mu_2) + \mu_1,\ C_{1|2})$ distribution.

Proof Observe that it suffices to prove the statement in the $(X_1^T, X_2^T)^T \sim N_{p+q}(0, C)$ case. The conditional distribution of $X_1$ conditioned on $X_2 = x_2$ is determined by the conditional pdf as follows:
$$f(x_1 \mid x_2) = \frac{(2\pi)^{-\frac{p+q}{2}} |C|^{-\frac12} \exp\left[-\frac12 (x_1^T, x_2^T)\, C^{-1} (x_1^T, x_2^T)^T\right]}{(2\pi)^{-\frac{q}{2}} |C_{22}|^{-\frac12} \exp\left[-\frac12 x_2^T C_{22}^{-1} x_2\right]}.$$
By Proposition 1, after writing $C^{-1}$ in block-matrix form and expanding the quadratic form in the numerator as the sum of four terms, this simplifies to
$$f(x_1 \mid x_2) = (2\pi)^{-\frac{p}{2}} |C_{1|2}|^{-\frac12} \exp\left[-\tfrac12 (x_1 - C_{12} C_{22}^{-1} x_2)^T C_{1|2}^{-1} (x_1 - C_{12} C_{22}^{-1} x_2)\right],$$
which is a $p$-dimensional Gaussian density with the stated parameters.

Note that the conditional covariance matrix $C_{1|2}$ does not depend on the value $x_2$ in the condition. Further, for the conditional expectation, which is the expectation of the conditional distribution, we get that
$$E(X_1 \mid X_2 = x_2) = C_{12} C_{22}^{-1} (x_2 - \mu_2) + \mu_1.$$
Therefore,
$$E(X_1 \mid X_2) = C_{12} C_{22}^{-1} (X_2 - \mu_2) + \mu_1,$$
which is a linear function of the coordinates of $X_2$. In the $p = q = 1$ case it is called the regression line, while in the $p = 1$, $q > 1$ case, the regression plane. We will not deal with the $p > 1$, $q > 1$ case, which is the topic of Canonical Correlation Analysis (CCA), Partial Least Squares Regression (PLS), and Structural Equation Modeling (SEM). Summarizing: in the case of the multidimensional Gaussian distribution, the regression functions are linear functions of the variables in the condition, a fact that has important consequences in multivariate statistical analysis.

Remark 1 Assume that $E(X) = 0$. Then with the notation $\varepsilon = X_1 - E(X_1 \mid X_2) = X_1 - C_{12} C_{22}^{-1} X_2$ we can decompose $X_1$ as the sum of $E(X_1 \mid X_2)$ and the error term $\varepsilon$. By construction, $E(\varepsilon) = 0$ and $E(C_{12} C_{22}^{-1} X_2\, \varepsilon^T) = O$. Therefore, the covariance matrix of $X_1$ is decomposed as
$$C_{11} = C_{12} C_{22}^{-1} C_{21} + C_{1|2},$$
in other words, the law of total variance applies:
$$\operatorname{Var}(X_1) = \operatorname{Var}(E(X_1 \mid X_2)) + E(\operatorname{Var}(X_1 \mid X_2)).$$
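A minimal NumPy sketch of Theorem 1 and Remark 1 (illustrative only; the dimensions, the random covariance, the seed, the sample size, and the helper name conditional_params are my choices): it computes the conditional parameters and checks by simulation that the regression residuals have covariance $C_{1|2}$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 2, 2

# Arbitrary positive definite covariance with zero means, for illustration.
A = rng.standard_normal((p + q, p + q))
C = A @ A.T + (p + q) * np.eye(p + q)
mu1, mu2 = np.zeros(p), np.zeros(q)

C11, C12 = C[:p, :p], C[:p, p:]
C21, C22 = C[p:, :p], C[p:, p:]

def conditional_params(x2):
    """Mean and covariance of X1 | X2 = x2, as in Theorem 1."""
    mean = mu1 + C12 @ np.linalg.solve(C22, x2 - mu2)
    cov = C11 - C12 @ np.linalg.solve(C22, C21)   # C_{1|2}, free of x2
    return mean, cov

# Remark 1: C11 = C12 C22^{-1} C21 + C_{1|2}.
_, C1_2 = conditional_params(np.zeros(q))
assert np.allclose(C11, C12 @ np.linalg.solve(C22, C21) + C1_2)

# Monte Carlo check: the residual X1 - C12 C22^{-1} X2 has covariance C_{1|2}.
X = rng.multivariate_normal(np.zeros(p + q), C, size=200_000)
X1, X2 = X[:, :p], X[:, p:]
eps = X1 - X2 @ np.linalg.solve(C22, C21)   # row-wise form of C12 C22^{-1} x2
print(np.cov(eps.T))   # should be close to C_{1|2}
print(C1_2)
```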
Theorem 2 Let $X = (X_1, \dots, X_m)^T \sim N_m(0, C)$ be a random vector, and let $\Gamma := \{1, \dots, m\}$ denote the index set of the variables, $m \ge 3$. Assume that $K = (k_{ij}) = C^{-1} > 0$. Then
$$r_{X_i X_j \mid X_{\Gamma \setminus \{i,j\}}} = \frac{-k_{ij}}{\sqrt{k_{ii} k_{jj}}}, \qquad i \ne j,$$
where $r_{X_i X_j \mid X_{\Gamma \setminus \{i,j\}}}$ denotes the partial correlation coefficient between $X_i$ and $X_j$ after eliminating the effect of the remaining variables $X_{\Gamma \setminus \{i,j\}}$. Further,
$$k_{ii} = \left(\operatorname{Var}(X_i \mid X_{\Gamma \setminus \{i\}})\right)^{-1}, \qquad i = 1, \dots, m,$$
is the reciprocal of the conditional variance of $X_i$ conditioned on the other variables $X_{\Gamma \setminus \{i\}}$.

Proof Without loss of generality we prove that
$$r_{X_1 X_2 \mid X_{\Gamma \setminus \{1,2\}}} = \frac{-k_{12}}{\sqrt{k_{11} k_{22}}}.$$
We use the result of Proposition 1 with $p = 2$, $q = m - 2$, and partition $C$ and $K$ as in Figure 1(a) and (b). Then regressing $X_1$ on $X_{\Gamma \setminus \{1,2\}}$ we get that
$$E(X_1 \mid X_{\Gamma \setminus \{1,2\}}) = c_1^T C_{22}^{-1} X_{\Gamma \setminus \{1,2\}},$$
and the error term is
$$\varepsilon_1 = X_1 - c_1^T C_{22}^{-1} X_{\Gamma \setminus \{1,2\}}.$$
Likewise, regressing $X_2$ on $X_{\Gamma \setminus \{1,2\}}$ we get that
$$E(X_2 \mid X_{\Gamma \setminus \{1,2\}}) = c_2^T C_{22}^{-1} X_{\Gamma \setminus \{1,2\}},$$
and the error term is
$$\varepsilon_2 = X_2 - c_2^T C_{22}^{-1} X_{\Gamma \setminus \{1,2\}}.$$
By definition,
$$r_{X_1 X_2 \mid X_{\Gamma \setminus \{1,2\}}} = \operatorname{Corr}(\varepsilon_1, \varepsilon_2).$$
To find this correlation, we compute the covariance and the variances of $\varepsilon_1$ and $\varepsilon_2$. As $E(\varepsilon_1) = 0$ and $E(\varepsilon_2) = 0$, with the notation of Figures 1(a) and (b),
$$\operatorname{Cov}(\varepsilon_1, \varepsilon_2) = E(\varepsilon_1 \varepsilon_2^T) = E(X_1 X_2^T) - c_1^T C_{22}^{-1} E(X_{\Gamma \setminus \{1,2\}} X_2^T) - E(X_1 X_{\Gamma \setminus \{1,2\}}^T) C_{22}^{-1} c_2 + c_1^T C_{22}^{-1} E(X_{\Gamma \setminus \{1,2\}} X_{\Gamma \setminus \{1,2\}}^T) C_{22}^{-1} c_2 = c_{12} - c_1^T C_{22}^{-1} c_2,$$
where we used that $E(X_{\Gamma \setminus \{1,2\}} X_{\Gamma \setminus \{1,2\}}^T) = C_{22}$. Likewise,
$$\operatorname{Var}(\varepsilon_1) = E(\varepsilon_1 \varepsilon_1^T) = c_{11} - c_1^T C_{22}^{-1} c_1$$
and
$$\operatorname{Var}(\varepsilon_2) = E(\varepsilon_2 \varepsilon_2^T) = c_{22} - c_2^T C_{22}^{-1} c_2.$$
Putting these together, and using that, by Proposition 1, the upper left $2 \times 2$ block of $K$ is $K_{11} = C_{1|2}^{-1}$, with the notation
$$C_{1|2} = \begin{pmatrix} a & b \\ b & d \end{pmatrix}$$
of Figure 1(c) we get that
$$r_{X_1 X_2 \mid X_{\Gamma \setminus \{1,2\}}} = \frac{b}{\sqrt{ad}} = \frac{\frac{b}{ad - b^2}}{\sqrt{\frac{a}{ad - b^2} \cdot \frac{d}{ad - b^2}}} = \frac{-k_{12}}{\sqrt{k_{11} k_{22}}}.$$
In the last equality we utilized that
$$K_{11} = \begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix} = C_{1|2}^{-1} = \frac{1}{ad - b^2} \begin{pmatrix} d & -b \\ -b & a \end{pmatrix},$$
so $k_{21} = k_{12} = \frac{-b}{ad - b^2}$.

To prove the statement for $k_{ii}$, it suffices to prove it for $k_{11}$. By Theorem 1, the conditional distribution of $X_1$ conditioned on $X_{\Gamma \setminus \{1\}}$ is the univariate $N_1(c_{12}^T C_{22}^{-1} x_{\Gamma \setminus \{1\}},\ C_{1|2})$ distribution, where
$$\operatorname{Var}(X_1 \mid X_{\Gamma \setminus \{1\}}) = C_{1|2} = c_{11} - c_{12}^T C_{22}^{-1} c_{12} = \frac{1}{k_{11}},$$
if we apply the result of Proposition 1 with $p = 1$, $q = m - 1$; see Figure 1(d). From Proposition 1 it is also clear that $C > 0$ implies $k_{11} \ne 0$. This completes the proof.
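Theorem 2 can also be checked numerically. The sketch below is an illustration and not part of the notes; the dimensions, seed, and sample size are arbitrary. It compares $-k_{ij}/\sqrt{k_{ii} k_{jj}}$ with the empirical correlation of regression residuals, and $1/k_{11}$ with the empirical conditional variance (indices are 0-based in the code).

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 200_000

# Arbitrary positive definite covariance matrix, chosen for illustration.
A = rng.standard_normal((m, m))
C = A @ A.T + m * np.eye(m)
K = np.linalg.inv(C)

# Theorem 2: partial correlations read off the concentration matrix K.
d = np.sqrt(np.diag(K))
partial_corr = -K / np.outer(d, d)          # entries for i != j

# Check against the definition for the pair (0, 1): correlate the residuals
# after regressing X_0 and X_1 on the remaining variables.
X = rng.multivariate_normal(np.zeros(m), C, size=n)
X12, Xrest = X[:, :2], X[:, 2:]
beta, *_ = np.linalg.lstsq(Xrest, X12, rcond=None)
eps = X12 - Xrest @ beta
print(np.corrcoef(eps[:, 0], eps[:, 1])[0, 1], partial_corr[0, 1])

# k_11 (here K[0, 0]) is the reciprocal of Var(X_1 | the rest).
beta1, *_ = np.linalg.lstsq(X[:, 1:], X[:, 0], rcond=None)
print(1.0 / K[0, 0], np.var(X[:, 0] - X[:, 1:] @ beta1))
```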
Definition 1 Assume $X \sim N_m(0, C)$ with $C > 0$. Consider the regression plane
$$E(X_1 \mid X_{\Gamma \setminus \{1\}} = x_{\Gamma \setminus \{1\}}) = \sum_{j \in \Gamma \setminus \{1\}} \beta_{1j \mid \Gamma \setminus \{1\}}\, x_j,$$
where the $x_j$'s are the coordinates of $x_{\Gamma \setminus \{1\}}$. Then we call the coefficient $\beta_{1j \mid \Gamma \setminus \{1\}}$ the $j$th partial regression coefficient, for $j \in \Gamma \setminus \{1\}$. The definition naturally extends to $\beta_{ij \mid \Gamma \setminus \{i\}}$ for $j \in \Gamma \setminus \{i\}$.

Theorem 3
$$\beta_{1j \mid \Gamma \setminus \{1\}} = \frac{-k_{1j}}{k_{11}} \qquad \text{for } j \in \Gamma \setminus \{1\}.$$

To prove the theorem, first we need a lemma.

Lemma 1 $C_{12} = O$ is equivalent to $K_{12} = O$, and
$$K_{11}^{-1} K_{12} = -C_{12} C_{22}^{-1}.$$

Proof of the Lemma By Proposition 1, an easy calculation shows that
$$K_{11}^{-1} K_{12} = C_{1|2} \left( -C_{1|2}^{-1} C_{12} C_{22}^{-1} \right) = -C_{12} C_{22}^{-1},$$
which finishes the proof.

Proof of Theorem 3 With the help of Figure 1(d), the equation of the regression plane is
$$E(X_1 \mid X_{\Gamma \setminus \{1\}} = x_{\Gamma \setminus \{1\}}) = c_{12}^T C_{22}^{-1} x_{\Gamma \setminus \{1\}} = \sum_{j \in \Gamma \setminus \{1\}} \left( -K_{11}^{-1} K_{12} \right)_j x_j = \sum_{j \in \Gamma \setminus \{1\}} \left( -\frac{k_{1j}}{k_{11}} \right) x_j,$$
where in the last step we used the Lemma with $p = 1$, $q = m - 1$, so that $K_{11} = k_{11}$ is a scalar and $K_{12} = (k_{12}, \dots, k_{1m})$. This finishes the proof.

Corollary 1 An important consequence of Theorem 3 is that
$$\beta_{ij \mid \Gamma \setminus \{i\}} = \frac{-k_{ij}}{k_{ii}} = r_{X_i X_j \mid X_{\Gamma \setminus \{i,j\}}} \sqrt{\frac{k_{jj}}{k_{ii}}}, \qquad j \ne i.$$
So only those variables $X_j$ whose partial correlation with $X_i$ (after eliminating the effect of the remaining variables) is nonzero enter into the regression of $X_i$ on the other variables. We sometimes say that the partial regression coefficient $\beta_{ij \mid \Gamma \setminus \{i\}}$ shows the change of $X_i$ under a unit change of $X_j$, keeping the other variables fixed; in Latin words, ceteris paribus. (A small sketch of reading these quantities off $K$ is given after this paragraph.)

If we form a graph $G$ on the vertex set $\Gamma$ with $i \sim j \iff k_{ij} \ne 0$ ($i \ne j$), then only the variables corresponding to vertices connected to vertex $i$ enter into the regression of $X_i$ on the other variables. This is called a Gaussian graphical model, and it is the basis of Path Analysis. If the Gaussian graphical model is decomposable (see graphical models in the Information Theory and Statistics course), then the maximal cliques, together with their separators (with multiplicities), form a so-called junction tree structure.
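The following NumPy sketch (illustrative only; the toy concentration matrix and the zero threshold are my choices) reads the partial regression coefficients of Theorem 3 and the edges of the graph $G$ off $K$.

```python
import numpy as np

# Concentration matrix of a toy Gaussian graphical model (my choice):
# X_0 - X_1 - X_2 is a path, so k_02 = 0 and the edge {0, 2} is absent.
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
m = K.shape[0]

# Theorem 3 / Corollary 1: the regression of X_i on the others has
# coefficients beta_{ij} = -k_ij / k_ii for j != i.
beta = -K / np.diag(K)[:, None]
np.fill_diagonal(beta, 0.0)      # zero out the meaningless diagonal
print("partial regression coefficients:\n", beta)

# Edges of G: i ~ j iff k_ij != 0 (up to numerical tolerance).
edges = [(i, j) for i in range(m) for j in range(i + 1, m)
         if abs(K[i, j]) > 1e-12]
print("edges of the Gaussian graphical model:", edges)
```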
Denote by $\mathcal{C}$ the set of the maximal cliques and by $\mathcal{S}$ the set of their separators in $G$. Then the ML estimator of $K$ can be calculated based on the clique and separator matrices defined below (see the ML estimators of the multivariate normal parameters). Let $n$ be the sample size for the underlying $m$-variate normal distribution, $\Gamma = \{1, \dots, m\}$, and assume that $m < n$. For the maximal clique $C \in \mathcal{C}$, let $[S_C]^\Gamma$ denote $n$ times the empirical covariance matrix corresponding to the variables $\{X_i : i \in C\}$, complemented with zero entries to have an $m \times m$ (symmetric, positive semidefinite) matrix. Likewise, for the separator $S \in \mathcal{S}$, let $[S_S]^\Gamma$ denote $n$ times the empirical covariance matrix corresponding to the variables $\{X_i : i \in S\}$, complemented with zero entries to have an $m \times m$ (symmetric, positive semidefinite) matrix. Then the ML estimator of the mean vector is the sample average (as usual), while the ML estimator of the concentration matrix is
$$\hat{K} = n \left\{ \sum_{C \in \mathcal{C}} \left[ S_C^{-1} \right]^\Gamma - \sum_{S \in \mathcal{S}} \left[ S_S^{-1} \right]^\Gamma \right\},$$
where $[S_C^{-1}]^\Gamma$ and $[S_S^{-1}]^\Gamma$ denote the inverses of the corresponding clique and separator blocks, complemented with zero entries to $m \times m$ matrices.

Testing hypotheses about partial correlations. For $i \ne j$ we want to test
$$H_0 : r_{X_i X_j \mid X_{\Gamma \setminus \{i,j\}}} = 0,$$
i.e., that $X_i$ and $X_j$ are conditionally independent conditioned on the remaining variables. Equivalently, $H_0$ means that there is no edge in $G$ between vertices $i$ and $j$, or equivalently, $\beta_{ij \mid \Gamma \setminus \{i\}} = 0$, $\beta_{ji \mid \Gamma \setminus \{j\}} = 0$, or simply, $k_{ij} = k_{ji} = 0$ ($C > 0$ should be assumed). To test $H_0$ in some form, several exact tests are known that are usually based on likelihood ratio tests or on Basu's theorem (see below). The following test uses the empirical partial correlation coefficient, denoted by $\hat{r} = \hat{r}_{X_i X_j \mid X_{\Gamma \setminus \{i,j\}}}$, and the following statistic based on it:
$$B = 1 - \hat{r}^2 = \frac{|S_{\Gamma \setminus \{i,j\}}|\, |S_\Gamma|}{|S_{\Gamma \setminus \{i\}}|\, |S_{\Gamma \setminus \{j\}}|},$$
where $S_H$ denotes the empirical covariance matrix of the variables $\{X_i : i \in H\}$. It can be proven that under $H_0$ the test statistic
$$t = \frac{\hat{r} \sqrt{n - m}}{\sqrt{1 - \hat{r}^2}}, \qquad |t| = \sqrt{n - m}\, \sqrt{\frac{1}{B} - 1},$$
is distributed as Student's $t$ with $n - m$ degrees of freedom. Therefore, we reject $H_0$ for large values of $|t|$ (a simulation sketch is given after Theorem 4).

Theorem 4 (Basu) If $T = T(X)$ is a sufficient and complete statistic with respect to the family $\mathcal{P}$ of distributions, and the distribution of $Y = s(X)$ does not depend on $P \in \mathcal{P}$, then $T$ and $Y$ are independent.
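As an illustration of the $t$-test (not from the notes; the concentration matrix, seed, sample size, and the use of scipy.stats are my choices), the following sketch simulates data under $H_0$ for the pair $(0, 1)$ and computes the statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
m, n = 4, 200

# Simulate under H0 for the pair (0, 1): choose K with k_01 = 0, so that the
# partial correlation of X_0 and X_1 given the rest is exactly zero.
K = np.array([[ 2.0,  0.0, -0.8, -0.5],
              [ 0.0,  2.0, -0.7, -0.6],
              [-0.8, -0.7,  2.0,  0.0],
              [-0.5, -0.6,  0.0,  2.0]])
X = rng.multivariate_normal(np.zeros(m), np.linalg.inv(K), size=n)

# Empirical partial correlation via the empirical concentration matrix
# (Theorem 2 applied to the sample covariance).
S = np.cov(X.T)
Khat = np.linalg.inv(S)
r = -Khat[0, 1] / np.sqrt(Khat[0, 0] * Khat[1, 1])

# Test statistic: t = r * sqrt(n - m) / sqrt(1 - r^2) ~ t_{n-m} under H0.
t = r * np.sqrt(n - m) / np.sqrt(1.0 - r**2)
p_value = 2.0 * stats.t.sf(abs(t), df=n - m)
print(f"r_hat = {r:.4f}, t = {t:.3f}, p = {p_value:.3f}")
```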
Figure 1: Convenient block partitions of $C$ and $K$ (panels (a)-(d), referenced in the proofs above; the figure itself is not reproduced here).
More information