Maximum Likelihood Estimation
|
|
- Joel Baker
- 5 years ago
- Views:
Transcription
1 Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017
2 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter 1-2, Appendix A, and Appendix B
3 Models Take an random vector Y R n which is observable and decompose Y = µ + ɛ into µ R n (unknown, fixed) and ɛ R n unobservable error vector (random) Usual assumptions? E[ɛ i ] = 0 i E[ɛ] = 0 E[Y] = µ (mean vector) ɛ i independent with Var(ɛ i ) = σ 2 and Cov(ɛ i, ɛ j ) = 0 Matrix version Cov[ɛ] [(E [ɛ i E[ɛ i ]])(E [ɛ j E[ɛ j ]])] ij = σ 2 I n Cov[Y] = σ 2 I n (errors are uncorrelated) ɛ i iid N(0, σ 2 ) implies that Y i ind N(µ i, σ 2 )
4 Likelihood Functions The likelihood function for µ, σ 2 is proportional to the sampling distribution of the data L(µ, σ 2 ) n i=1 1 (2πσ 2 ) exp 1 2 (2πσ 2 ) n/2 exp { (σ 2 ) n/2 exp 1 2 { (σ 2 ) n/2 exp 1 2 { 1 2 (2π) n/2 I n σ 2 1/2 exp Last line is the density of Y N n ( µ, σ 2 I n ) { (yi µ i ) 2 } σ 2 i (Y i µ i ) 2 ) σ 2 } (Y µ) T } (Y µ) σ 2 Y µ 2 } σ 2 { 1 2 Y µ 2 } σ 2
5 MLEs Find values of ˆµ and ˆσ 2 that maximize the likelihood L(µ, σ 2 ) for µ R n and σ 2 R + { L(µ, σ 2 ) (σ 2 ) n/2 exp 1 Y µ 2 } 2 σ 2 log(l(µ, σ 2 )) n 2 log(σ2 ) 1 Y µ 2 2 σ 2 or equivalently the log likelihood Clearly, ˆµ = Y but ˆσ 2 = 0 is outside the parameter space Need restrictions on µ = Xβ
6 Column Space Let X 1, X 2,..., X p R n The set of all linear combinations of X 1,..., X p is the space spanned by X 1,..., X p S(X 1,..., X p ) Let X = [X 1 X 2... X p ] be a n p matrix with columns X j then the column space of X, C(X) = S(X 1,..., X p ) space spanned by the (column) vectors of X µ C(X) : C(X) = {µ µ R n such that Xβ = µ for some β R p } (also called the Range of X, R(X)) β are the coordinates of µ in this space C(X) is a subspace of R n Many equivalent ways to represent the same mean vector inference should be independent of the coordinate system used
7 Projections µ = Xβ with X full rank µ C(X) P X = X(X T X) 1 X T P X is the orthogonal projection operator on the column space of X; e.g. P = P 2 idempotent (projection) P 2 X = P XP X = X(X T X) 1 X T X(X T X) 1 X T = X(X T X) 1 X T = P X P = P T symmetry (orthogonal) P T X = (X(XT X) 1 X T ) T = (X T ) T ((X T X) 1 ) T (X) T = X(X T X) 1 X T = P X P X µ = P X Xβ = X(X T X) 1 X T Xβ = Xβ = µ
8 Projections Claim: I P X is an orthogonal projection onto C(X) idempotent (I P X ) 2 = (I P X )(I P X ) = I P X P X + P X P X = I P X P X + P X = I P X Symmetry I P X = (I P X ) T u C(X) u C(X) that is u C(X) and v C(X) then u T v = 0 (I P X )u = u (projection) if v C(X), (I P X )v = v v = 0
9 Log Likelihood Y N(µ, σ 2 I n ) with µ = Xβ and X full column rank Claim: Maximum Likelihood Estimator (MLE) of µ is P X Y Log Likelihood: log L(µ, σ 2 ) = n 2 log(σ2 ) 1 Y µ 2 2 σ 2 Decompose Y = P X Y + (I P X )Y Use P X µ = µ Simplify Y µ 2
10 Expand Y µ 2 = (I P X )Y + P x Y µ 2 = (I P X )Y + P x Y P X µ 2 = (I P x )Y + P X (Y µ) 2 = (I P x )Y 2 + P X (Y µ) 2 + 2(Y µ) T P T X (I P X )Y = (I P x )Y 2 + P X (Y µ) = (I P x )Y 2 + P X Y µ 2 Crossproduct term is zero P T X (I P X) = P X (I P X ) = P X P X P X = P X P X = 0
11 Likelihood Substitute decomposition into log likelihood log L(µ, σ 2 ) = n 2 log(σ2 ) 1 Y µ 2 2 σ 2 = n 2 log(σ2 ) 1 ( (I PX )Y 2 2 σ 2 + P XY µ 2 ) σ 2 = n 2 log(σ2 ) 1 (I P X )Y 2 }{{ 2 σ P X Y µ 2 }} 2 {{ σ 2 } = constant with respect to µ 0 Maximize with respect to µ for each σ 2 RHS is largest when µ = P X Y for any choice of σ 2 ˆµ = P X Y is the MLE of µ (yields fitted values Ŷ = P X Y)
12 MLE of β L(µ, σ 2 ) = n 2 log(σ2 ) 1 ( (I PX )Y 2 2 σ 2 + P XY µ 2 ) σ 2 L(β, σ 2 ) = n 2 log(σ2 ) 1 ( (I PX )Y 2 2 σ 2 + P XY Xβ 2 ) σ 2 Similar argument to show that RHS is maximized by minimizing P X Y Xβ 2 Therefore ˆβ is a MLE of β if and only if satisfies P X Y = Xˆβ If X T X is full rank, the MLE of β is (X T X) 1 X T Y = ˆβ
13 MLE of σ 2 Plug-in MLE of ˆµ for µ and differentiate with respect to σ 2 log L(ˆµ, σ 2 ) = n 2 log σ2 1 (I P X )Y 2 2 σ 2 log L(ˆµ, σ 2 ) σ 2 = n 2 1 σ (I P X)Y 2 Set derivative to zero and solve for MLE 0 = n 1 2 ˆσ ( ) (I P X)Y 2 ˆσ 2 n 2 ˆσ2 = 1 2 (I P X)Y 2 ˆσ 2 = (I P X)Y 2 n ( ) 1 2 σ 2
14 Estimate of σ 2 Maximum Likelihood Estimate of σ 2 ˆσ 2 = (I P X)Y 2 n = YT (I P X ) T (I P X )Y n = YT (I P X )Y n = et e n where e = (I P X )Y residuals from the regression of Y on X
15 Geometric View Fitted Values Ŷ = P X Y = Xˆβ Residuals e = (I P X )Y Y = Ŷ + e Y 2 = P X Y 2 + (I P X )Y 2
16 Properties Ŷ = ˆµ is an unbiased estimate of µ = Xβ E[Ŷ] = E[P XY] = P X E[Y] = P X µ = µ E[e] = 0 if µ C(X) E[e] = E[(I P X )Y] = (I P X )E[Y] = (I P X )µ = 0 Will not be 0 if µ / C(X) (useful for model checking)
17 Estimate of σ 2 MLE of σ 2 : ˆσ 2 = et e n = YT (I P X )Y n Is this an unbiased estimate of σ 2? Need expectations of quadratic forms Y T AY for A an n n matrix Y a random vector in R n
18 Quadratic Forms Without loss of generality we can assume that A = A T Y T AY is a scalar Y T AY = (Y T AY) T = Y T A T Y may take A = A T Y T AY + Y T A T Y 2 = Y T AY Y T (A + AT ) Y = Y T AY 2
19 Expectations of Quadratic Forms Theorem Let Y be a random vector in R n with E[Y] = µ and Cov(Y) = Σ. Then E[Y T AY] = traσ + µ T Aµ. Result useful for finding expected values of Mean Squares; no normality required!
20 Proof Start with (Y µ) T A(Y µ), expand and take expectations E[(Y µ) T A(Y µ)] = E[Y T AY + µ T Aµ µ T AY Y T Aµ] = E[Y T AY] + µ T Aµ µ T Aµ µ T Aµ = E[Y T AY] µ T Aµ Rearrange E[Y T AY] = E[(Y µ) T A(Y µ)] + µ T Aµ = E[tr(Y µ) T A(Y µ)] + µ T Aµ = E[trA(Y µ)(y µ) T ] + µ T Aµ = tre[a(y µ)(y µ) T ] + µ T Aµ = trae([(y µ)(y µ) T ] + µ T Aµ = traσ + µ T Aµ tra n i=1 a ii
21 Expectation of ˆσ 2 Use the theorem: E[Y T (I P X )Y] = tr(i P X )σ 2 I + µ T (I P X )µ = σ 2 tr(i P X ) = σ 2 r(i P X ) = σ 2 (n r(x)) Therefore an unbiased estimate of σ 2 is e T e n r(x) If X is full rank (r(x) = p) and P X = X(X T X) 1 X T then the tr(p X ) = tr(x(x T X) 1 X T ) = tr(x T X(X T X) 1 ) = tr(i p ) = p
22 Spectral Theorem Theorem If A (n n) is a symmetric real matrix then there exists a U (n n) such that U T U = UU T = I n and a diagonal matrix Λ with elements λ i such that A = UΛU T U is an orthogonal matrix; U 1 = U T The columns of U from an Orthonormal Basis for R n rank of A equals the number of non-zero eigenvalues λ i Columns of U associated with non-zero eigenvalues form an ONB for C(A) (eigenvectors of A) A p = UΛ p U T (matrix powers) a square root of A > 0 is UΛ 1/2 U T
23 Projections Projection Matrix If P is an orthogonal projection matrix, then its eigenvalues λ i are either zero or one with tr(p) = i (λ i) = r(p) P = UΛU T P = P 2 UΛU T UΛU T = UΛ 2 U T Λ = Λ 2 is true only for λ i = 1 or λ i = 0 Since r(p) is the number of non-zero eigenvalues, r(p) = λ i = tr(p) [ ] [ ] Ir 0 U T P = [U P U P ] P 0 0 n r U T = U P U T P P r P = u i u T i sum of r rank 1 projections. i=1
24 Next Class distribution theory Continue Reading Chapter 1-2 and Appendices A & B in Christensen
MLES & Multivariate Normal Theory
Merlise Clyde September 6, 2016 Outline Expectations of Quadratic Forms Distribution Linear Transformations Distribution of estimates under normality Properties of MLE s Recap Ŷ = ˆµ is an unbiased estimate
More information3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No.
7. LEAST SQUARES ESTIMATION 1 EXERCISE: Least-Squares Estimation and Uniqueness of Estimates 1. For n real numbers a 1,...,a n, what value of a minimizes the sum of squared distances from a to each of
More informationSTAT 135 Lab 13 (Review) Linear Regression, Multivariate Random Variables, Prediction, Logistic Regression and the δ-method.
STAT 135 Lab 13 (Review) Linear Regression, Multivariate Random Variables, Prediction, Logistic Regression and the δ-method. Rebecca Barter May 5, 2015 Linear Regression Review Linear Regression Review
More informationSampling Distributions
Merlise Clyde Duke University September 8, 2016 Outline Topics Normal Theory Chi-squared Distributions Student t Distributions Readings: Christensen Apendix C, Chapter 1-2 Prostate Example > library(lasso2);
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini April 27, 2018 1 / 1 Table of Contents 2 / 1 Linear Algebra Review Read 3.1 and 3.2 from text. 1. Fundamental subspace (rank-nullity, etc.) Im(X ) = ker(x T ) R
More informationSampling Distributions
Merlise Clyde Duke University September 3, 2015 Outline Topics Normal Theory Chi-squared Distributions Student t Distributions Readings: Christensen Apendix C, Chapter 1-2 Prostate Example > library(lasso2);
More informationXβ is a linear combination of the columns of X: Copyright c 2010 Dan Nettleton (Iowa State University) Statistics / 25 X =
The Gauss-Markov Linear Model y Xβ + ɛ y is an n random vector of responses X is an n p matrix of constants with columns corresponding to explanatory variables X is sometimes referred to as the design
More informationRegression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin
Regression Review Statistics 149 Spring 2006 Copyright c 2006 by Mark E. Irwin Matrix Approach to Regression Linear Model: Y i = β 0 + β 1 X i1 +... + β p X ip + ɛ i ; ɛ i iid N(0, σ 2 ), i = 1,..., n
More informationLecture 15. Hypothesis testing in the linear model
14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More informationMatrix Approach to Simple Linear Regression: An Overview
Matrix Approach to Simple Linear Regression: An Overview Aspects of matrices that you should know: Definition of a matrix Addition/subtraction/multiplication of matrices Symmetric/diagonal/identity matrix
More informationLinear Models Review
Linear Models Review Vectors in IR n will be written as ordered n-tuples which are understood to be column vectors, or n 1 matrices. A vector variable will be indicted with bold face, and the prime sign
More information5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.
88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationGauss Markov & Predictive Distributions
Gauss Markov & Predictive Distributions Merlise Clyde STA721 Linear Models Duke University September 14, 2017 Outline Topics Gauss-Markov Theorem Estimability and Prediction Readings: Christensen Chapter
More informationProperties of Matrices and Operations on Matrices
Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,
More informationEstimation of the Response Mean. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 27
Estimation of the Response Mean Copyright c 202 Dan Nettleton (Iowa State University) Statistics 5 / 27 The Gauss-Markov Linear Model y = Xβ + ɛ y is an n random vector of responses. X is an n p matrix
More informationMath 423/533: The Main Theoretical Topics
Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)
More informationIntroduction to Estimation Methods for Time Series models. Lecture 1
Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationCh4. Distribution of Quadratic Forms in y
ST4233, Linear Models, Semester 1 2008-2009 Ch4. Distribution of Quadratic Forms in y 1 Definition Definition 1.1 If A is a symmetric matrix and y is a vector, the product y Ay = i a ii y 2 i + i j a ij
More informationANOVA: Analysis of Variance - Part I
ANOVA: Analysis of Variance - Part I The purpose of these notes is to discuss the theory behind the analysis of variance. It is a summary of the definitions and results presented in class with a few exercises.
More informationLinear models. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. October 5, 2016
Linear models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark October 5, 2016 1 / 16 Outline for today linear models least squares estimation orthogonal projections estimation
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationChapter 1. Linear Regression with One Predictor Variable
Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical
More informationOrthogonal Projection and Least Squares Prof. Philip Pennance 1 -Version: December 12, 2016
Orthogonal Projection and Least Squares Prof. Philip Pennance 1 -Version: December 12, 2016 1. Let V be a vector space. A linear transformation P : V V is called a projection if it is idempotent. That
More information2018 2019 1 9 sei@mistiu-tokyoacjp http://wwwstattu-tokyoacjp/~sei/lec-jhtml 11 552 3 0 1 2 3 4 5 6 7 13 14 33 4 1 4 4 2 1 1 2 2 1 1 12 13 R?boxplot boxplotstats which does the computation?boxplotstats
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationLecture 11: Regression Methods I (Linear Regression)
Lecture 11: Regression Methods I (Linear Regression) Fall, 2017 1 / 40 Outline Linear Model Introduction 1 Regression: Supervised Learning with Continuous Responses 2 Linear Models and Multiple Linear
More information3 Multiple Linear Regression
3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is
More informationLecture 11: Regression Methods I (Linear Regression)
Lecture 11: Regression Methods I (Linear Regression) 1 / 43 Outline 1 Regression: Supervised Learning with Continuous Responses 2 Linear Models and Multiple Linear Regression Ordinary Least Squares Statistical
More informationStatistics 910, #5 1. Regression Methods
Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationMIT Spring 2015
Regression Analysis MIT 18.472 Dr. Kempthorne Spring 2015 1 Outline Regression Analysis 1 Regression Analysis 2 Multiple Linear Regression: Setup Data Set n cases i = 1, 2,..., n 1 Response (dependent)
More informationMAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik
MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 51 Outline 1 Matrix Expression 2 Linear and quadratic forms 3 Properties of quadratic form 4 Properties of estimates 5 Distributional properties 3 / 51 Matrix
More information2. A Review of Some Key Linear Models Results. Copyright c 2018 Dan Nettleton (Iowa State University) 2. Statistics / 28
2. A Review of Some Key Linear Models Results Copyright c 2018 Dan Nettleton (Iowa State University) 2. Statistics 510 1 / 28 A General Linear Model (GLM) Suppose y = Xβ + ɛ, where y R n is the response
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationThe Hilbert Space of Random Variables
The Hilbert Space of Random Variables Electrical Engineering 126 (UC Berkeley) Spring 2018 1 Outline Fix a probability space and consider the set H := {X : X is a real-valued random variable with E[X 2
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationBasic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = The errors are uncorrelated with common variance:
8. PROPERTIES OF LEAST SQUARES ESTIMATES 1 Basic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = 0. 2. The errors are uncorrelated with common variance: These assumptions
More informationMatrices and Multivariate Statistics - II
Matrices and Multivariate Statistics - II Richard Mott November 2011 Multivariate Random Variables Consider a set of dependent random variables z = (z 1,..., z n ) E(z i ) = µ i cov(z i, z j ) = σ ij =
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationSTAT 540: Data Analysis and Regression
STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State
More informationChapter 3. Matrices. 3.1 Matrices
40 Chapter 3 Matrices 3.1 Matrices Definition 3.1 Matrix) A matrix A is a rectangular array of m n real numbers {a ij } written as a 11 a 12 a 1n a 21 a 22 a 2n A =.... a m1 a m2 a mn The array has m rows
More informationLecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices
Lecture 3: Simple Linear Regression in Matrix Format To move beyond simple regression we need to use matrix algebra We ll start by re-expressing simple linear regression in matrix form Linear algebra is
More informationQuick Review on Linear Multiple Regression
Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,
More informationwhere x and ȳ are the sample means of x 1,, x n
y y Animal Studies of Side Effects Simple Linear Regression Basic Ideas In simple linear regression there is an approximately linear relation between two variables say y = pressure in the pancreas x =
More informationBIOS 2083 Linear Models c Abdus S. Wahed
Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter
More informationProbability and Statistics Notes
Probability and Statistics Notes Chapter Seven Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Seven Notes Spring 2011 1 / 42 Outline
More informationFundamentals of Matrices
Maschinelles Lernen II Fundamentals of Matrices Christoph Sawade/Niels Landwehr/Blaine Nelson Tobias Scheffer Matrix Examples Recap: Data Linear Model: f i x = w i T x Let X = x x n be the data matrix
More informationBayes Estimators & Ridge Regression
Readings Chapter 14 Christensen Merlise Clyde September 29, 2015 How Good are Estimators? Quadratic loss for estimating β using estimator a L(β, a) = (β a) T (β a) How Good are Estimators? Quadratic loss
More informationTopic 7 - Matrix Approach to Simple Linear Regression. Outline. Matrix. Matrix. Review of Matrices. Regression model in matrix form
Topic 7 - Matrix Approach to Simple Linear Regression Review of Matrices Outline Regression model in matrix form - Fall 03 Calculations using matrices Topic 7 Matrix Collection of elements arranged in
More informationEstimating Estimable Functions of β. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 17
Estimating Estimable Functions of β Copyright c 202 Dan Nettleton (Iowa State University) Statistics 5 / 7 The Response Depends on β Only through Xβ In the Gauss-Markov or Normal Theory Gauss-Markov Linear
More informationIntroduction to Simple Linear Regression
Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department
More information9.1 Orthogonal factor model.
36 Chapter 9 Factor Analysis Factor analysis may be viewed as a refinement of the principal component analysis The objective is, like the PC analysis, to describe the relevant variables in study in terms
More informationExam 2. Jeremy Morris. March 23, 2006
Exam Jeremy Morris March 3, 006 4. Consider a bivariate normal population with µ 0, µ, σ, σ and ρ.5. a Write out the bivariate normal density. The multivariate normal density is defined by the following
More informationRegression diagnostics
Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model
More informationLinear Models and Estimation by Least Squares
Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:
More information4 Multiple Linear Regression
4 Multiple Linear Regression 4. The Model Definition 4.. random variable Y fits a Multiple Linear Regression Model, iff there exist β, β,..., β k R so that for all (x, x 2,..., x k ) R k where ε N (, σ
More informationEcon 620. Matrix Differentiation. Let a and x are (k 1) vectors and A is an (k k) matrix. ) x. (a x) = a. x = a (x Ax) =(A + A (x Ax) x x =(A + A )
Econ 60 Matrix Differentiation Let a and x are k vectors and A is an k k matrix. a x a x = a = a x Ax =A + A x Ax x =A + A x Ax = xx A We don t want to prove the claim rigorously. But a x = k a i x i i=
More informationMATH 829: Introduction to Data Mining and Analysis Linear Regression: statistical tests
1/16 MATH 829: Introduction to Data Mining and Analysis Linear Regression: statistical tests Dominique Guillot Departments of Mathematical Sciences University of Delaware February 17, 2016 Statistical
More informationLecture Notes on Different Aspects of Regression Analysis
Andreas Groll WS 2012/2013 Lecture Notes on Different Aspects of Regression Analysis Department of Mathematics, Workgroup Financial Mathematics, Ludwig-Maximilians-University Munich, Theresienstr. 39,
More informationLinear Algebra Review
Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More information11 Hypothesis Testing
28 11 Hypothesis Testing 111 Introduction Suppose we want to test the hypothesis: H : A q p β p 1 q 1 In terms of the rows of A this can be written as a 1 a q β, ie a i β for each row of A (here a i denotes
More informationMatrix Factorizations
1 Stat 540, Matrix Factorizations Matrix Factorizations LU Factorization Definition... Given a square k k matrix S, the LU factorization (or decomposition) represents S as the product of two triangular
More informationOutline. Remedial Measures) Extra Sums of Squares Standardized Version of the Multiple Regression Model
Outline 1 Multiple Linear Regression (Estimation, Inference, Diagnostics and Remedial Measures) 2 Special Topics for Multiple Regression Extra Sums of Squares Standardized Version of the Multiple Regression
More informationChapter 5 Matrix Approach to Simple Linear Regression
STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More informationDistributions of Quadratic Forms. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 31
Distributions of Quadratic Forms Copyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 31 Under the Normal Theory GMM (NTGMM), y = Xβ + ε, where ε N(0, σ 2 I). By Result 5.3, the NTGMM
More informationA Note on UMPI F Tests
A Note on UMPI F Tests Ronald Christensen Professor of Statistics Department of Mathematics and Statistics University of New Mexico May 22, 2015 Abstract We examine the transformations necessary for establishing
More information18.S096 Problem Set 3 Fall 2013 Regression Analysis Due Date: 10/8/2013
18.S096 Problem Set 3 Fall 013 Regression Analysis Due Date: 10/8/013 he Projection( Hat ) Matrix and Case Influence/Leverage Recall the setup for a linear regression model y = Xβ + ɛ where y and ɛ are
More informationMatrix Algebra, part 2
Matrix Algebra, part 2 Ming-Ching Luoh 2005.9.12 1 / 38 Diagonalization and Spectral Decomposition of a Matrix Optimization 2 / 38 Diagonalization and Spectral Decomposition of a Matrix Also called Eigenvalues
More informationMa 3/103: Lecture 24 Linear Regression I: Estimation
Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the
More informationA Vector Space Justification of Householder Orthogonalization
A Vector Space Justification of Householder Orthogonalization Ronald Christensen Professor of Statistics Department of Mathematics and Statistics University of New Mexico August 28, 2015 Abstract We demonstrate
More informationWLS and BLUE (prelude to BLUP) Prediction
WLS and BLUE (prelude to BLUP) Prediction Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 21, 2018 Suppose that Y has mean X β and known covariance matrix V (but Y need
More informationSTAT 8260 Theory of Linear Models Lecture Notes
STAT 8260 Theory of Linear Models Lecture Notes Classical linear models are at the core of the field of statistics, and are probably the most commonly used set of statistical techniques in practice. For
More informationChapter 6: Orthogonality
Chapter 6: Orthogonality (Last Updated: November 7, 7) These notes are derived primarily from Linear Algebra and its applications by David Lay (4ed). A few theorems have been moved around.. Inner products
More informationMultivariate Linear Regression Models
Multivariate Linear Regression Models Regression analysis is used to predict the value of one or more responses from a set of predictors. It can also be used to estimate the linear association between
More informationSTA 2101/442 Assignment Four 1
STA 2101/442 Assignment Four 1 One version of the general linear model with fixed effects is y = Xβ + ɛ, where X is an n p matrix of known constants with n > p and the columns of X linearly independent.
More informationPCA and admixture models
PCA and admixture models CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price PCA and admixture models 1 / 57 Announcements HW1
More informationLinear models and their mathematical foundations: Simple linear regression
Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction
More informationMethods for sparse analysis of high-dimensional data, II
Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional
More informationPh.D. Qualifying Exam Friday Saturday, January 3 4, 2014
Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently
More informationSTAT 151A: Lab 1. 1 Logistics. 2 Reference. 3 Playing with R: graphics and lm() 4 Random vectors. Billy Fang. 2 September 2017
STAT 151A: Lab 1 Billy Fang 2 September 2017 1 Logistics Billy Fang (blfang@berkeley.edu) Office hours: Monday 9am-11am, Wednesday 10am-12pm, Evans 428 (room changes will be written on the chalkboard)
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationLinear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77
Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical
More informationEconomics 620, Lecture 5: exp
1 Economics 620, Lecture 5: The K-Variable Linear Model II Third assumption (Normality): y; q(x; 2 I N ) 1 ) p(y) = (2 2 ) exp (N=2) 1 2 2(y X)0 (y X) where N is the sample size. The log likelihood function
More informationRidge Regression. Flachs, Munkholt og Skotte. May 4, 2009
Ridge Regression Flachs, Munkholt og Skotte May 4, 2009 As in usual regression we consider a pair of random variables (X, Y ) with values in R p R and assume that for some (β 0, β) R +p it holds that E(Y
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Mixed models Yan Lu March, 2018, week 8 1 / 32 Restricted Maximum Likelihood (REML) REML: uses a likelihood function calculated from the transformed set
More informationIntroduction to Normal Distribution
Introduction to Normal Distribution Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 17-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Introduction
More informationI i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic.
Serik Sagitov, Chalmers and GU, February, 08 Solutions chapter Matlab commands: x = data matrix boxplot(x) anova(x) anova(x) Problem.3 Consider one-way ANOVA test statistic For I = and = n, put F = MS
More informationML and REML Variance Component Estimation. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 58
ML and REML Variance Component Estimation Copyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 58 Suppose y = Xβ + ε, where ε N(0, Σ) for some positive definite, symmetric matrix Σ.
More informationREGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University
REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.
More informationGeneral Linear Model: Statistical Inference
Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least
More informationThe Multivariate Gaussian Distribution
The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance
More information