ST 740: Linear Models and Multivariate Normal Inference
1 ST 740: Linear Models and Multivariate Normal Inference

Alyson Wilson
Department of Statistics, North Carolina State University
November 4, 2013

A. Wilson (NCSU STAT), Linear Models, November 4, 2013
2 Normal Linear Regression Model

We assume that $E[Y \mid x] = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$ and that

$$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \epsilon_i = \beta^T x_i + \epsilon_i$$

where the $\epsilon_i$ are i.i.d. Normal$(0, \sigma^2)$. This implies

$$p(y_1, \ldots, y_n \mid x_1, \ldots, x_n, \beta, \sigma^2) = \prod_{i=1}^n p(y_i \mid x_i, \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - \beta^T x_i)^2 \right)$$
3 Normal Linear Regression Model: Multivariate Normal

If we think of $y$ as the $n$-dimensional column vector $(y_1, \ldots, y_n)^T$, and $X$ as the $n \times p$ matrix with $i$th row $x_i^T$, then we can write the normal linear regression model as

$$y \mid X, \beta, \sigma^2 \sim \text{MVN}(X\beta, \sigma^2 I)$$
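The matrix form is exactly equivalent to the product of univariate normal densities on the previous slide. A quick numerical sketch of that equivalence (the simulated data and parameter values here are my own illustrative choices, not from the course):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
sigma2 = 0.7

# y | X, beta, sigma^2 ~ MVN(X beta, sigma^2 I)
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
resid = y - X @ beta

# Log-likelihood written two ways:
# (1) product of n univariate Normal(beta^T x_i, sigma^2) densities
ll_univ = -0.5 * n * np.log(2 * np.pi * sigma2) - resid @ resid / (2 * sigma2)
# (2) one n-dimensional MVN density with mean X beta, covariance sigma^2 I
cov = sigma2 * np.eye(n)
ll_mvn = (-0.5 * n * np.log(2 * np.pi)
          - 0.5 * np.linalg.slogdet(cov)[1]
          - 0.5 * resid @ np.linalg.solve(cov, resid))
assert np.isclose(ll_univ, ll_mvn)
```

Because the covariance is $\sigma^2 I$, the determinant and quadratic form factor across observations, which is why the two expressions agree.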
4 Multivariate Normal Sampling Distribution

Assume that we have data $y_1, \ldots, y_n$ with a multivariate normal sampling distribution. If $y_i$ is a $p$-dimensional vector, we have

$$f(y_i \mid \theta, \Sigma) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left( -(y_i - \theta)^T \Sigma^{-1} (y_i - \theta)/2 \right)$$
5 Multivariate Normal Inference: Prior Distribution for θ

If our sampling distribution is univariate normal, we saw that it was often (computationally) convenient to specify the prior distribution for the mean as a normal distribution. Suppose that we decide to specify the prior distribution for $\theta$, the mean of our multivariate normal sampling distribution, as MVN$(\mu_0, \Lambda_0)$.

What is the full conditional distribution of $\theta$ given $y$ and $\Sigma$? To figure this out, we'll need the likelihood (MVN$(\theta, \Sigma)$), the prior distribution for $\theta$, and some information about the prior distribution for $\Sigma$.
6 Prior Distribution for θ

First, let's rewrite the prior distribution for $\theta$.

$$\begin{aligned}
\pi(\theta \mid \mu_0, \Lambda_0) &= (2\pi)^{-p/2} |\Lambda_0|^{-1/2} \exp\left( -(\theta - \mu_0)^T \Lambda_0^{-1} (\theta - \mu_0)/2 \right) \\
&= (2\pi)^{-p/2} |\Lambda_0|^{-1/2} \exp\left[ -\tfrac{1}{2} \theta^T \Lambda_0^{-1} \theta + \theta^T \Lambda_0^{-1} \mu_0 - \tfrac{1}{2} \mu_0^T \Lambda_0^{-1} \mu_0 \right] \\
&\propto \exp\left[ -\tfrac{1}{2} \theta^T \Lambda_0^{-1} \theta + \theta^T \Lambda_0^{-1} \mu_0 \right] = \exp\left[ -\tfrac{1}{2} \theta^T A_0 \theta + \theta^T b_0 \right]
\end{aligned}$$

where $A_0 = \Lambda_0^{-1}$ and $b_0 = \Lambda_0^{-1} \mu_0$. Notice that the prior covariance matrix is $A_0^{-1}$ and the prior mean is $A_0^{-1} b_0$.
7 Sampling Distribution

Now, let's take a look at the sampling distribution for $y_1, \ldots, y_n$.

$$\begin{aligned}
\pi(y_1, \ldots, y_n \mid \theta, \Sigma) &= \prod_{i=1}^n (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left( -(y_i - \theta)^T \Sigma^{-1} (y_i - \theta)/2 \right) \\
&= (2\pi)^{-np/2} |\Sigma|^{-n/2} \exp\left[ -\tfrac{1}{2} \sum_{i=1}^n (y_i - \theta)^T \Sigma^{-1} (y_i - \theta) \right] \\
&\propto |\Sigma|^{-n/2} \exp\left[ -\tfrac{1}{2} \sum_{i=1}^n y_i^T \Sigma^{-1} y_i \right] \exp\left[ -\tfrac{n}{2} \theta^T \Sigma^{-1} \theta + \theta^T \Sigma^{-1} n\bar{y} \right] \\
&= |\Sigma|^{-n/2} \exp\left[ -\tfrac{1}{2} \sum_{i=1}^n y_i^T \Sigma^{-1} y_i \right] \exp\left[ -\tfrac{1}{2} \theta^T A_1 \theta + \theta^T b_1 \right]
\end{aligned}$$

where $A_1 = n\Sigma^{-1}$ and $b_1 = n\Sigma^{-1}\bar{y}$.
8 Posterior/Full Conditional Distribution for θ

$$\pi(\theta, \Sigma \mid y_1, \ldots, y_n) \propto \pi(\Sigma)\, |\Sigma|^{-n/2} \exp\left[ -\tfrac{1}{2} \sum_{i=1}^n y_i^T \Sigma^{-1} y_i \right] \exp\left[ -\tfrac{1}{2} \theta^T A_1 \theta + \theta^T b_1 \right] \exp\left[ -\tfrac{1}{2} \theta^T A_0 \theta + \theta^T b_0 \right]$$
9 Posterior/Full Conditional Distribution for θ

Assuming that the prior distribution for $\Sigma$ does not depend on $\theta$,

$$\begin{aligned}
\pi(\theta \mid \Sigma, y_1, \ldots, y_n) &\propto \exp\left[ -\tfrac{1}{2} \theta^T A_1 \theta + \theta^T b_1 \right] \exp\left[ -\tfrac{1}{2} \theta^T A_0 \theta + \theta^T b_0 \right] \\
&= \exp\left[ -\tfrac{1}{2} \theta^T (A_0 + A_1) \theta + \theta^T (b_0 + b_1) \right]
\end{aligned}$$

This is a MVN with covariance matrix and mean

$$(A_0 + A_1)^{-1} = (\Lambda_0^{-1} + n\Sigma^{-1})^{-1}$$
$$(A_0 + A_1)^{-1} (b_0 + b_1) = (\Lambda_0^{-1} + n\Sigma^{-1})^{-1} (\Lambda_0^{-1} \mu_0 + n\Sigma^{-1} \bar{y})$$
10 Full Conditional Distribution for θ

This is a MVN with covariance matrix

$$(A_0 + A_1)^{-1} = (\Lambda_0^{-1} + n\Sigma^{-1})^{-1}$$

The precision (inverse variance) is the sum of the prior precision and the data precision. The mean is

$$(A_0 + A_1)^{-1} (b_0 + b_1) = (\Lambda_0^{-1} + n\Sigma^{-1})^{-1} (\Lambda_0^{-1} \mu_0 + n\Sigma^{-1} \bar{y})$$

which is a weighted average of the prior mean and the sample mean.
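A minimal numpy sketch of this update (the data, known $\Sigma$, and hyperparameters below are my own illustrative choices). With a diffuse prior the data precision dominates, so the weighted-average posterior mean lands very close to the sample mean:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 2, 100
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])   # treat Sigma as known for this step
y = rng.multivariate_normal([1.0, -1.0], Sigma, size=n)
ybar = y.mean(axis=0)

mu0 = np.zeros(p)
Lambda0 = 10.0 * np.eye(p)                    # diffuse prior covariance

A0 = np.linalg.inv(Lambda0)                   # prior precision
A1 = n * np.linalg.inv(Sigma)                 # data precision
b0, b1 = A0 @ mu0, A1 @ ybar

post_cov = np.linalg.inv(A0 + A1)             # (Lambda0^{-1} + n Sigma^{-1})^{-1}
post_mean = post_cov @ (b0 + b1)

assert np.allclose(post_cov, post_cov.T)      # valid covariance: symmetric
assert np.allclose(post_mean, ybar, atol=0.05)  # diffuse prior: mean ~ ybar
```

Shrinking $\Lambda_0$ toward zero pulls `post_mean` back toward `mu0` instead, which is the weighted-average behavior described above.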
11 Multivariate Normal Inference: Prior Distribution for Σ

To construct a prior distribution for $\Sigma$, we need a distribution over symmetric, positive definite matrices. The idea is a generalization of the chi-squared (gamma) distribution.

1. Sample $z_1, \ldots, z_{\nu_0}$ i.i.d. MVN$(0, \Phi_0)$.
2. Calculate $Z^T Z = \sum_{i=1}^{\nu_0} z_i z_i^T$.

The distribution of the matrices $Z^T Z$ is called a Wishart$(\nu_0, \Phi_0)$ distribution.
12 Wishart Properties

- If $\nu_0 > p$, then $Z^T Z$ is positive definite with probability 1.
- $Z^T Z$ is symmetric with probability 1.
- $E[Z^T Z] = \nu_0 \Phi_0$

By analogy with the univariate normal case, we use the Wishart distribution as a prior for the precision matrix and the inverse Wishart distribution as a prior for the covariance matrix.
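The two-step construction translates directly into code. A Monte Carlo sketch (my own choices of $p$, $\nu_0$, $\Phi_0$) checking the mean property $E[Z^T Z] = \nu_0 \Phi_0$:

```python
import numpy as np

rng = np.random.default_rng(2)
p, nu0 = 2, 5
Phi0 = np.array([[2.0, 0.5], [0.5, 1.0]])

def rwish(nu, Phi, rng):
    # Wishart(nu, Phi) draw via the defining construction:
    # sum of outer products of nu i.i.d. MVN(0, Phi) vectors
    Z = rng.multivariate_normal(np.zeros(len(Phi)), Phi, size=nu)
    return Z.T @ Z

draws = [rwish(nu0, Phi0, rng) for _ in range(20000)]
mean_draw = np.mean(draws, axis=0)

# E[Z^T Z] = nu0 * Phi0 (up to Monte Carlo error)
assert np.allclose(mean_draw, nu0 * Phi0, rtol=0.05)
```

Each draw is automatically symmetric, and with $\nu_0 = 5 > p = 2$ it is positive definite with probability 1, matching the properties listed above.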
13 Prior Distribution for Σ: Inverse Wishart Distribution

1. Sample $z_1, \ldots, z_{\nu_0}$ i.i.d. MVN$(0, S_0^{-1})$.
2. Calculate $(Z^T Z)^{-1} = \left( \sum_{i=1}^{\nu_0} z_i z_i^T \right)^{-1}$.

The distribution of the matrices $(Z^T Z)^{-1}$ is called an inverse Wishart$(\nu_0, S_0^{-1})$ distribution.
14 Inverse Wishart Distribution

$$E[Z^T Z] = \nu_0 S_0^{-1} \qquad E[(Z^T Z)^{-1}] = \frac{1}{\nu_0 - p - 1} S_0$$

The density function is

$$\pi(\Sigma) = \left[ 2^{\nu_0 p/2}\, \pi^{\binom{p}{2}/2}\, |S_0|^{-\nu_0/2} \prod_{j=1}^p \Gamma([\nu_0 + 1 - j]/2) \right]^{-1} |\Sigma|^{-(\nu_0 + p + 1)/2} \exp\left[ -\text{tr}(S_0 \Sigma^{-1})/2 \right]$$
15 Choosing Parameters

For the inverse Wishart distribution, think of $\nu_0$ as the prior sample size. If we are confident that the true covariance matrix is near some covariance matrix $\Sigma_0$, choose $\nu_0$ large and set $S_0 = (\nu_0 - p - 1)\Sigma_0$. If we choose $\nu_0 = p + 2$ and $S_0 = \Sigma_0$, then the samples will be loosely centered around $\Sigma_0$.
16 Full Conditional Distribution for Σ: Useful Matrix Algebra Fact

$$\sum_{k=1}^K b_k^T A b_k = \text{tr}(B^T B A)$$

where $B$ is the matrix whose $k$th row is $b_k^T$. Therefore,

$$\sum_{i=1}^n (y_i - \theta)^T \Sigma^{-1} (y_i - \theta) = \text{tr}(S_\theta \Sigma^{-1})$$

where $S_\theta = \sum_{i=1}^n (y_i - \theta)(y_i - \theta)^T$ is the residual sum of squares matrix for $y_1, \ldots, y_n$.
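This trace identity is easy to verify numerically. A sketch with arbitrary data of my own (any positive definite $\Sigma$ works):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 30, 3
y = rng.normal(size=(n, p))
theta = rng.normal(size=p)

# Build an arbitrary positive definite Sigma: A A^T + p I is always PD
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)
Sinv = np.linalg.inv(Sigma)

# Left side: sum of n quadratic forms
lhs = sum((yi - theta) @ Sinv @ (yi - theta) for yi in y)

# Right side: tr(S_theta Sigma^{-1}), with S_theta the residual SS matrix
resid = y - theta
S_theta = resid.T @ resid
rhs = np.trace(S_theta @ Sinv)

assert np.isclose(lhs, rhs)
```

The identity matters computationally as well: the right side replaces $n$ quadratic forms with one $p \times p$ matrix product, which is the form the inverse Wishart density wants.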
17 Full Conditional Distribution for Σ

$$\begin{aligned}
\pi(\theta, \Sigma \mid y_1, \ldots, y_n) &\propto \pi(\theta)\, |\Sigma|^{-n/2} \exp\left( -\tfrac{1}{2} \sum_{i=1}^n (y_i - \theta)^T \Sigma^{-1} (y_i - \theta) \right) |\Sigma|^{-(\nu_0 + p + 1)/2} \exp\left[ -\text{tr}(S_0 \Sigma^{-1})/2 \right] \\
\pi(\Sigma \mid \theta, y_1, \ldots, y_n) &\propto |\Sigma|^{-(n + \nu_0 + p + 1)/2} \exp\left( -\tfrac{1}{2} \text{tr}((S_0 + S_\theta) \Sigma^{-1}) \right)
\end{aligned}$$

So the full conditional distribution of $\Sigma$ is inverse Wishart$(\nu_0 + n, [S_0 + S_\theta]^{-1})$.
18 Noninformative Prior for Multivariate Normal Data

A commonly used noninformative prior is

$$\pi(\theta, \Sigma) \propto |\Sigma|^{-(p+1)/2}$$
19 MCMC for Multivariate Normal Model

To summarize, suppose that our model is

$$Y_i \mid \theta, \Sigma \sim \text{MVN}(\theta, \Sigma) \qquad \theta \sim \text{MVN}(\mu_0, \Lambda_0) \qquad \Sigma \sim \text{inverse Wishart}(\nu_0, S_0^{-1})$$

We can draw a random sample from our posterior distribution using Gibbs sampling:

$$\pi(\theta \mid y_1, \ldots, y_n, \Sigma) = \text{MVN}\left( (\Lambda_0^{-1} + n\Sigma^{-1})^{-1} (\Lambda_0^{-1} \mu_0 + n\Sigma^{-1} \bar{y}),\; (\Lambda_0^{-1} + n\Sigma^{-1})^{-1} \right)$$
$$\pi(\Sigma \mid y_1, \ldots, y_n, \theta) = \text{inverse Wishart}(\nu_0 + n, [S_0 + S_\theta]^{-1})$$

The R package MCMCpack has a random inverse Wishart generator, riwish(v, S). It seems to be built for Gibbs sampling: it generates only one random draw at a time.
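The slide points to riwish in R; the same sampler can be sketched in Python with only numpy, drawing the inverse Wishart through the defining normal-outer-product construction from slide 13. The simulated data, hyperparameters, and iteration counts below are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative simulated data (not from the course)
p, n = 2, 200
theta_true = np.array([1.0, -1.0])
Sigma_true = np.array([[1.0, 0.5], [0.5, 2.0]])
y = rng.multivariate_normal(theta_true, Sigma_true, size=n)
ybar = y.mean(axis=0)

# Illustrative hyperparameters: diffuse on theta, weak on Sigma
mu0, Lambda0 = np.zeros(p), 100.0 * np.eye(p)
nu0, S0 = p + 2, np.eye(p)

def rinvwish(nu, inv_scale, rng):
    # Inverse Wishart(nu, inv_scale) via the defining construction:
    # z_i ~ MVN(0, inv_scale), return (Z^T Z)^{-1}
    Z = rng.multivariate_normal(np.zeros(p), inv_scale, size=nu)
    return np.linalg.inv(Z.T @ Z)

theta, Sigma = ybar.copy(), np.cov(y.T)
thetas = []
for s in range(2000):
    # Step 1: theta | Sigma, y  (MVN full conditional above)
    L0inv, Siginv = np.linalg.inv(Lambda0), np.linalg.inv(Sigma)
    post_cov = np.linalg.inv(L0inv + n * Siginv)
    post_mean = post_cov @ (L0inv @ mu0 + n * Siginv @ ybar)
    theta = rng.multivariate_normal(post_mean, post_cov)
    # Step 2: Sigma | theta, y  ~  inverse Wishart(nu0 + n, [S0 + S_theta]^{-1})
    resid = y - theta
    S_theta = resid.T @ resid
    Sigma = rinvwish(nu0 + n, np.linalg.inv(S0 + S_theta), rng)
    thetas.append(theta)

# Posterior mean of theta after burn-in; diffuse prior keeps it near ybar
theta_hat = np.mean(thetas[500:], axis=0)
```

Each full conditional is sampled in turn; after burn-in, the pairs $(\theta, \Sigma)$ are (dependent) draws from the joint posterior.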
20 Classical Approach

From a classical approach, we choose an estimator for $\beta$ that minimizes $\sum_{i=1}^n (y_i - \beta^T x_i)^2$. We call $\hat{\beta}_{OLS}$ the ordinary least squares estimate of $\beta$.
21 Bayesian Approach: Semi-Conjugate Prior Distribution for β

Suppose that we choose a prior distribution for $\beta$ that is MVN$(\beta_0, \Sigma_0)$.

$$\begin{aligned}
\pi(\beta \mid y, X, \sigma^2) &\propto p(y \mid X, \beta, \sigma^2)\, \pi(\beta) \\
&\propto \exp\left( -\frac{1}{2\sigma^2} (y^T y - 2\beta^T X^T y + \beta^T X^T X \beta) \right) \exp\left( -\tfrac{1}{2} \beta^T \Sigma_0^{-1} \beta + \beta^T \Sigma_0^{-1} \beta_0 \right) \\
&\propto \exp\left( -\tfrac{1}{2} \beta^T X^T X \beta / \sigma^2 + \beta^T X^T y / \sigma^2 \right) \exp\left( -\tfrac{1}{2} \beta^T \Sigma_0^{-1} \beta + \beta^T \Sigma_0^{-1} \beta_0 \right) \\
&= \exp\left( -\tfrac{1}{2} \beta^T (X^T X / \sigma^2 + \Sigma_0^{-1}) \beta + \beta^T (X^T y / \sigma^2 + \Sigma_0^{-1} \beta_0) \right)
\end{aligned}$$
22 Bayesian Approach: Semi-Conjugate Prior Distribution for β

The full conditional distribution for $\beta$, $\pi(\beta \mid y, X, \sigma^2)$, is

$$\text{MVN}\left( (X^T X / \sigma^2 + \Sigma_0^{-1})^{-1} (X^T y / \sigma^2 + \Sigma_0^{-1} \beta_0),\; (X^T X / \sigma^2 + \Sigma_0^{-1})^{-1} \right)$$
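This conditional has the same precision-weighting structure as the multivariate-mean update earlier. A numpy sketch (my own simulated data and prior settings) showing that with a diffuse prior the conditional posterior mean essentially reproduces OLS:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.0])
sigma2 = 1.0                                  # treat sigma^2 as known here
y = X @ beta_true + rng.normal(size=n)

beta0, Sigma0 = np.zeros(p), 25.0 * np.eye(p)  # diffuse MVN prior on beta

prec = X.T @ X / sigma2 + np.linalg.inv(Sigma0)
post_cov = np.linalg.inv(prec)
post_mean = post_cov @ (X.T @ y / sigma2 + np.linalg.inv(Sigma0) @ beta0)

# With weak prior precision, the conditional posterior mean is near OLS
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(post_mean, beta_ols, atol=0.05)
```

Tightening $\Sigma_0$ pulls `post_mean` toward `beta0`, the usual ridge-like shrinkage of the semi-conjugate prior.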
23 Bayesian Approach: Semi-Conjugate Prior Distribution for σ²

Suppose that we choose a prior distribution for $\sigma^2$ that is InverseGamma$(\nu_0/2, \nu_0 \sigma_0^2/2)$.

$$\begin{aligned}
\pi(\sigma^2 \mid y, X, \beta) &\propto p(y \mid X, \beta, \sigma^2)\, \pi(\sigma^2) \\
&\propto (\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} (y - X\beta)^T (y - X\beta) \right) (1/\sigma^2)^{\nu_0/2 + 1} \exp(-\nu_0 \sigma_0^2 / 2\sigma^2) \\
&\propto (1/\sigma^2)^{(n + \nu_0)/2 + 1} \exp\left( -\frac{1}{2\sigma^2} \left( \nu_0 \sigma_0^2 + (y - X\beta)^T (y - X\beta) \right) \right)
\end{aligned}$$

which is InverseGamma$\left( (n + \nu_0)/2,\; (\nu_0 \sigma_0^2 + (y - X\beta)^T (y - X\beta))/2 \right)$.
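Sampling from this inverse gamma needs only a gamma generator, since $1/\sigma^2$ is gamma distributed. A sketch with my own simulated data and prior values, checking the draws against the known inverse gamma mean:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 100, 2
X = rng.normal(size=(n, p))
beta = np.array([1.0, 2.0])                   # condition on this beta
y = X @ beta + rng.normal(scale=0.5, size=n)

nu0, s20 = 1.0, 1.0                           # illustrative prior values
resid = y - X @ beta
shape = (n + nu0) / 2
scale = (nu0 * s20 + resid @ resid) / 2

# InverseGamma(shape, scale) draw: sample G ~ Gamma(shape, 1/scale), return 1/G
draws = 1.0 / rng.gamma(shape, 1.0 / scale, size=5000)

# The inverse gamma mean is scale / (shape - 1); check the draws match it
assert np.isclose(draws.mean(), scale / (shape - 1), rtol=0.1)
```

Inside a Gibbs sampler this draw alternates with the MVN draw of $\beta$ from the previous slide.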
24 Other Prior Distributions

One common (and convenient) noninformative prior distribution is uniform on $(\beta, \log \sigma)$, i.e.

$$\pi(\beta, \sigma^2) \propto \frac{1}{\sigma^2}$$

In this case, we can show that

$$\pi(\beta \mid \sigma^2, y, X) = \text{MVN}\left( (X^T X)^{-1} X^T y,\; (X^T X)^{-1} \sigma^2 \right)$$
$$\pi(\sigma^2 \mid y, X) = \text{InverseGamma}\left( \frac{n - p}{2},\; \frac{(y - X\hat{\beta}_{OLS})^T (y - X\hat{\beta}_{OLS})}{2} \right)$$

It is straightforward to derive all of the full conditionals and marginals for this model, so there are many choices for calculating samples from the posterior distribution.
25 Other Prior Distributions

When we discussed noninformative prior distributions, we learned that prior distributions can be selected to capture many different kinds of knowledge. If we do not have much knowledge about the values of the regression parameters, we may wish to choose a prior distribution that is somehow minimally informative.
26 Other Prior Distributions: Unit Information Prior

One suggestion for the normal linear regression model comes from Kass and Wasserman (1995). The idea is to create a prior distribution that contains the same amount of information as one observation.

Recall that the variance of $\hat{\beta}_{OLS}$ is $\sigma^2 (X^T X)^{-1}$. If we think about the precision (inverse variance) of this estimate, we have $(X^T X)/\sigma^2$. The idea is that the precision of $\hat{\beta}_{OLS}$ is the amount of information in $n$ observations, so the amount of information in one observation is one $n$th as much, or $(X^T X)/n\sigma^2$. Suppose that we set $\Sigma_0 = n\sigma^2 (X^T X)^{-1}$ in the prior distribution for $\beta$. They also suggest setting $\beta_0 = \hat{\beta}_{OLS}$, $\nu_0 = 1$, and $\sigma_0^2 = \hat{\sigma}^2_{OLS}$.
27 Other Prior Distributions: g-prior

The g-prior starts from the idea that parameter estimation should be invariant to the scale of the regressors. Suppose that $X$ is a given set of regressors and $\tilde{X} = XH$ for some $p \times p$ matrix $H$. If we obtain the posterior distribution of $\beta$ from $y$ and $X$, and of $\tilde{\beta}$ from $y$ and $\tilde{X}$, then, according to this principle of invariance, the posterior distributions of $\beta$ and $H\tilde{\beta}$ should be the same.

$$\pi(\beta \mid \sigma^2) = \text{MVN}(0, k(X^T X)^{-1}) \qquad \pi(\sigma^2) = \text{InverseGamma}(\nu_0/2, \nu_0 \sigma_0^2/2)$$

A popular choice for $k$ is $g\sigma^2$. Choosing $g$ large reflects less informative prior beliefs.
28 Other Prior Distributions: g-prior

In this case, we can show that

$$\pi(\beta \mid \sigma^2, g, y, X) = \text{MVN}\left( \frac{g}{g+1} (X^T X)^{-1} X^T y,\; \frac{g}{g+1} (X^T X)^{-1} \sigma^2 \right)$$
$$\pi(\sigma^2 \mid y, X) = \text{InverseGamma}\left( \frac{n + \nu_0}{2},\; \frac{\nu_0 \sigma_0^2 + \text{SSR}_g}{2} \right)$$

where $\text{SSR}_g = y^T \left( I - \frac{g}{g+1} X (X^T X)^{-1} X^T \right) y$. Another popular noninformative choice is $\nu_0 = 0$.
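A short numpy sketch of the g-prior's shrinkage behavior (simulated data and the value of $g$ are my own illustrative choices): the conditional posterior mean is exactly OLS scaled by $g/(g+1)$, and $\text{SSR}_g$ sits between the OLS residual sum of squares ($g \to \infty$) and $y^T y$ ($g = 0$):

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, g = 50, 2, 10.0
X = rng.normal(size=(n, p))
y = X @ np.array([1.5, -0.5]) + rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T          # hat (projection) matrix
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Conditional posterior mean: OLS shrunk toward the prior mean of zero
post_mean = (g / (g + 1)) * beta_ols
assert np.linalg.norm(post_mean) < np.linalg.norm(beta_ols)

# SSR_g = y^T (I - g/(g+1) H) y interpolates between the two extremes
ssr_g = y @ (np.eye(n) - (g / (g + 1)) * H) @ y
ssr_ols = y @ (np.eye(n) - H) @ y
assert ssr_ols <= ssr_g <= y @ y
```

Increasing $g$ moves `post_mean` toward `beta_ols` and `ssr_g` toward the OLS residual sum of squares, which is the "less informative prior" behavior noted on the previous slide.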
The Normal Linear Regression Model with Natural Conjugate Prior March 7, 2016 The Normal Linear Regression Model with Natural Conjugate Prior The plan Estimate simple regression model using Bayesian methods
More informationBayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence
Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns
More informationMetropolis-Hastings Algorithm
Strength of the Gibbs sampler Metropolis-Hastings Algorithm Easy algorithm to think about. Exploits the factorization properties of the joint probability distribution. No difficult choices to be made to
More informationThe Relationship Between the Power Prior and Hierarchical Models
Bayesian Analysis 006, Number 3, pp. 55 574 The Relationship Between the Power Prior and Hierarchical Models Ming-Hui Chen, and Joseph G. Ibrahim Abstract. The power prior has emerged as a useful informative
More informationGaussian Models (9/9/13)
STA561: Probabilistic machine learning Gaussian Models (9/9/13) Lecturer: Barbara Engelhardt Scribes: Xi He, Jiangwei Pan, Ali Razeen, Animesh Srivastava 1 Multivariate Normal Distribution The multivariate
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing
More informationBayesian Inference for Regression Parameters
Bayesian Inference for Regression Parameters 1 Bayesian inference for simple linear regression parameters follows the usual pattern for all Bayesian analyses: 1. Form a prior distribution over all unknown
More informationA tutorial on. Richard Boys
A tutorial on Bayesian inference for the normal linear model Richard Boys Statistics Research Group, Newcastle University, UK Motivation Much of Bayesian inference nowadays analyses complex hierarchical
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationMultivariate spatial models and the multikrig class
Multivariate spatial models and the multikrig class Stephan R Sain, IMAGe, NCAR ENAR Spring Meetings March 15, 2009 Outline Overview of multivariate spatial regression models Case study: pedotransfer functions
More informationOutline. Motivation Contest Sample. Estimator. Loss. Standard Error. Prior Pseudo-Data. Bayesian Estimator. Estimators. John Dodson.
s s Practitioner Course: Portfolio Optimization September 24, 2008 s The Goal of s The goal of estimation is to assign numerical values to the parameters of a probability model. Considerations There are
More informationECON The Simple Regression Model
ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationAdvanced Statistical Modelling
Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1
More informationSimple Linear Regression
Simple Linear Regression Reading: Hoff Chapter 9 November 4, 2009 Problem Data: Observe pairs (Y i,x i ),i = 1,... n Response or dependent variable Y Predictor or independent variable X GOALS: Exploring
More informationRidge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation
Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking
More informationLecture 3. Univariate Bayesian inference: conjugate analysis
Summary Lecture 3. Univariate Bayesian inference: conjugate analysis 1. Posterior predictive distributions 2. Conjugate analysis for proportions 3. Posterior predictions for proportions 4. Conjugate analysis
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More information