ST 740: Linear Models and Multivariate Normal Inference


1 ST 740: Linear Models and Multivariate Normal Inference
Alyson Wilson
Department of Statistics
North Carolina State University
November 4, 2013
A. Wilson (NCSU STAT) Linear Models November 4, 2013 / 28

2 Normal Linear Regression Model
We assume that E[Y | x] = β₁x₁ + β₂x₂ + ... + β_p x_p and that
    Y_i = β₁x_{i1} + β₂x_{i2} + ... + β_p x_{ip} + ε_i = βᵀx_i + ε_i,
where the ε_i are i.i.d. Normal(0, σ²). This implies
    p(y₁,...,y_n | x₁,...,x_n, β, σ²) = ∏_{i=1}^n p(y_i | x_i, β, σ²)
                                      = (2πσ²)^{-n/2} exp( -(1/(2σ²)) ∑_{i=1}^n (y_i - βᵀx_i)² )
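As a quick sanity check on this factorization, the vectorized log-density can be compared against the product of univariate normal densities. A minimal sketch; the design matrix, coefficients, and noise level below are made-up illustrations, not from the slides:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, p = 20, 3
X = rng.normal(size=(n, p))          # rows are the x_i
beta = np.array([1.0, -2.0, 0.5])    # illustrative coefficients
sigma2 = 0.7
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

# Joint log-density in the vectorized form from the slide
resid = y - X @ beta
joint = -0.5 * n * np.log(2 * np.pi * sigma2) - resid @ resid / (2 * sigma2)

# Sum of the n univariate Normal(beta^T x_i, sigma^2) log-densities
per_obs = norm.logpdf(y, loc=X @ beta, scale=np.sqrt(sigma2)).sum()
```

The two quantities agree to floating-point precision, confirming the i.i.d. factorization.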

3 Normal Linear Regression Model: Multivariate Normal
If we think of y as the n-dimensional column vector (y₁,...,y_n)ᵀ and X as the n × p matrix with ith row x_iᵀ, then we can write the normal linear regression model as
    y | X, β, σ² ~ MVN(Xβ, σ²I)

4 Multivariate Normal Sampling Distribution
Assume that we have data y₁,...,y_n with a multivariate normal sampling distribution. If y_i is a p-dimensional vector, we have
    f(y_i | θ, Σ) = (2π)^{-p/2} |Σ|^{-1/2} exp( -(y_i - θ)ᵀΣ⁻¹(y_i - θ)/2 )

5 Multivariate Normal Inference: Prior Distribution for θ
If our sampling distribution is univariate normal, we saw that it was often (computationally) convenient to specify a normal prior distribution for the mean. Suppose that we specify the prior distribution for θ, the mean of our multivariate normal sampling distribution, as MVN(µ₀, Λ₀). What is the full conditional distribution of θ given y and Σ? To figure this out, we'll need the likelihood (MVN(θ, Σ)), the prior distribution for θ, and some information about the prior distribution for Σ.

6 Prior Distribution for θ
First, let's rewrite the prior distribution for θ:
    π(θ | µ₀, Λ₀) = (2π)^{-p/2} |Λ₀|^{-1/2} exp( -(θ - µ₀)ᵀΛ₀⁻¹(θ - µ₀)/2 )
                  = (2π)^{-p/2} |Λ₀|^{-1/2} exp( -½ θᵀΛ₀⁻¹θ + θᵀΛ₀⁻¹µ₀ - ½ µ₀ᵀΛ₀⁻¹µ₀ )
                  ∝ exp( -½ θᵀΛ₀⁻¹θ + θᵀΛ₀⁻¹µ₀ )
                  = exp( -½ θᵀA₀θ + θᵀb₀ ),
where A₀ = Λ₀⁻¹ and b₀ = Λ₀⁻¹µ₀. Notice that the prior covariance matrix is A₀⁻¹ and the prior mean is A₀⁻¹b₀.

7 Sampling Distribution
Now, let's take a look at the sampling distribution for y₁,...,y_n:
    π(y₁,...,y_n | θ, Σ) = ∏_{i=1}^n (2π)^{-p/2} |Σ|^{-1/2} exp( -(y_i - θ)ᵀΣ⁻¹(y_i - θ)/2 )
                         = (2π)^{-np/2} |Σ|^{-n/2} exp( -½ ∑_{i=1}^n (y_i - θ)ᵀΣ⁻¹(y_i - θ) )
                         ∝ |Σ|^{-n/2} exp( -½ ∑_{i=1}^n y_iᵀΣ⁻¹y_i ) exp( -(n/2) θᵀΣ⁻¹θ + θᵀΣ⁻¹(nȳ) )
                         = |Σ|^{-n/2} exp( -½ ∑_{i=1}^n y_iᵀΣ⁻¹y_i ) exp( -½ θᵀA₁θ + θᵀb₁ ),
where A₁ = nΣ⁻¹ and b₁ = nΣ⁻¹ȳ.

8 Posterior/Full Conditional Distribution for θ
    π(θ, Σ | y₁,...,y_n) ∝ π(Σ) |Σ|^{-n/2} exp( -½ ∑_{i=1}^n y_iᵀΣ⁻¹y_i )
                           × exp( -½ θᵀA₁θ + θᵀb₁ ) exp( -½ θᵀA₀θ + θᵀb₀ )

9 Posterior/Full Conditional Distribution for θ
Assuming that the prior distribution for Σ does not depend on θ,
    π(θ | Σ, y₁,...,y_n) ∝ exp( -½ θᵀA₁θ + θᵀb₁ ) exp( -½ θᵀA₀θ + θᵀb₀ )
                         ∝ exp( -½ θᵀ(A₀ + A₁)θ + θᵀ(b₀ + b₁) )
This is a MVN with covariance matrix
    (A₀ + A₁)⁻¹ = (Λ₀⁻¹ + nΣ⁻¹)⁻¹
and mean
    (A₀ + A₁)⁻¹(b₀ + b₁) = (Λ₀⁻¹ + nΣ⁻¹)⁻¹(Λ₀⁻¹µ₀ + nΣ⁻¹ȳ)

10 Full Conditional Distribution for θ
This is a MVN with covariance matrix
    (A₀ + A₁)⁻¹ = (Λ₀⁻¹ + nΣ⁻¹)⁻¹
The precision (inverse variance) is the sum of the prior precision and the data precision. The mean is
    (A₀ + A₁)⁻¹(b₀ + b₁) = (Λ₀⁻¹ + nΣ⁻¹)⁻¹(Λ₀⁻¹µ₀ + nΣ⁻¹ȳ),
which is a weighted average of the prior mean and the sample mean.
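The precision-weighted combination is easy to compute directly. A small numeric sketch, with made-up values of µ₀, Λ₀, Σ, and data; with a vague prior, the posterior mean lands essentially on ȳ:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 2, 50
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])     # treated as known here
y = rng.multivariate_normal([1.0, -1.0], Sigma, size=n)

mu0 = np.zeros(p)
Lambda0 = 10.0 * np.eye(p)                     # vague prior covariance

A0 = np.linalg.inv(Lambda0)                    # prior precision
A1 = n * np.linalg.inv(Sigma)                  # data precision
b0 = A0 @ mu0
b1 = A1 @ y.mean(axis=0)

post_cov = np.linalg.inv(A0 + A1)              # (Λ0⁻¹ + nΣ⁻¹)⁻¹
post_mean = post_cov @ (b0 + b1)               # matrix-weighted average of µ0 and ȳ
```

Because Λ₀ is large relative to Σ/n, the data precision dominates and post_mean ≈ ȳ.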

11 Multivariate Normal Inference: Prior Distribution for Σ
To construct a prior distribution for Σ, we need a distribution over symmetric, positive definite matrices. The idea is a generalization of the chi-squared (gamma) distribution.
1. Sample z₁,...,z_{ν₀} i.i.d. MVN(0, Φ₀).
2. Calculate ZᵀZ = ∑_{i=1}^{ν₀} z_i z_iᵀ.
The distribution of the matrices ZᵀZ is called a Wishart(ν₀, Φ₀) distribution.

12 Wishart Properties
If ν₀ > p, then ZᵀZ is positive definite with probability 1.
ZᵀZ is symmetric with probability 1.
E[ZᵀZ] = ν₀Φ₀
By analogy with the univariate normal case, we use the Wishart distribution as a prior for the precision matrix and the inverse Wishart distribution as a prior for the covariance matrix.
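The two-step construction above can be simulated directly; the ν₀ and Φ₀ below are arbitrary illustrations. A modest Monte Carlo run shows the draws are symmetric and positive definite and that the sample mean approaches ν₀Φ₀:

```python
import numpy as np

rng = np.random.default_rng(2)
p, nu0 = 2, 5
Phi0 = np.array([[2.0, 0.5], [0.5, 1.0]])

def rwishart(nu, Phi, rng):
    """One draw of Z^T Z where z_1,...,z_nu are i.i.d. MVN(0, Phi)."""
    Z = rng.multivariate_normal(np.zeros(len(Phi)), Phi, size=nu)
    return Z.T @ Z

draws = np.array([rwishart(nu0, Phi0, rng) for _ in range(20000)])
W = draws[0]
```

Each draw is symmetric by construction, and positive definite with probability 1 since ν₀ > p.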

13 Prior Distribution for Σ: Inverse Wishart Distribution
1. Sample z₁,...,z_{ν₀} i.i.d. MVN(0, S₀⁻¹).
2. Calculate (ZᵀZ)⁻¹ = (∑_{i=1}^{ν₀} z_i z_iᵀ)⁻¹.
The distribution of the matrices (ZᵀZ)⁻¹ is called an inverse Wishart(ν₀, S₀⁻¹) distribution.

14 Inverse Wishart Distribution
    E[ZᵀZ] = ν₀S₀⁻¹
    E[(ZᵀZ)⁻¹] = S₀/(ν₀ - p - 1)
The density function is
    π(Σ) = [ 2^{ν₀p/2} π^{p(p-1)/4} |S₀|^{-ν₀/2} ∏_{j=1}^p Γ([ν₀ + 1 - j]/2) ]⁻¹ |Σ|^{-(ν₀+p+1)/2} exp[ -tr(S₀Σ⁻¹)/2 ]

15 Choosing Parameters
For the inverse Wishart distribution, think of ν₀ as the prior sample size. If we are confident that the true covariance matrix is near some covariance matrix Σ₀, choose ν₀ large and set S₀ = (ν₀ - p - 1)Σ₀. If we choose ν₀ = p + 2 and S₀ = Σ₀, then the samples will be loosely centered around Σ₀.

16 Full Conditional Distribution for Σ: Useful Matrix Algebra Fact
    ∑_{k=1}^K b_kᵀ A b_k = tr(BᵀBA),
where B is the matrix whose kth row is b_kᵀ. Therefore,
    ∑_{i=1}^n (y_i - θ)ᵀΣ⁻¹(y_i - θ) = tr(S_θΣ⁻¹),
where S_θ = ∑_{i=1}^n (y_i - θ)(y_i - θ)ᵀ is the residual sum of squares matrix for y₁,...,y_n.
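The trace identity is easy to verify numerically; all values below are arbitrary made-up data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 10, 3
y = rng.normal(size=(n, p))                    # rows are the y_i
theta = rng.normal(size=p)
B = rng.normal(size=(p, p))
Sigma = B @ B.T + np.eye(p)                    # a positive definite covariance
Siginv = np.linalg.inv(Sigma)

# Left side: sum of quadratic forms
quad = sum((yi - theta) @ Siginv @ (yi - theta) for yi in y)

# Right side: trace of S_theta Sigma^{-1}
R = y - theta                                  # residual matrix, rows (y_i - theta)^T
S_theta = R.T @ R                              # residual sum of squares matrix
trace_form = np.trace(S_theta @ Siginv)
```

The two expressions agree to floating-point precision, which is the identity used on the next slide.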

17 Full Conditional Distribution for Σ
    π(θ, Σ | y₁,...,y_n) ∝ π(θ) |Σ|^{-n/2} exp( -½ ∑_{i=1}^n (y_i - θ)ᵀΣ⁻¹(y_i - θ) )
                           × |Σ|^{-(ν₀+p+1)/2} exp[ -tr(S₀Σ⁻¹)/2 ]
    π(Σ | θ, y₁,...,y_n) ∝ |Σ|^{-(n+ν₀+p+1)/2} exp( -½ tr((S₀ + S_θ)Σ⁻¹) )
So the full conditional distribution of Σ is Inverse Wishart(ν₀ + n, [S₀ + S_θ]⁻¹).

18 Noninformative Prior for Multivariate Normal Data
A commonly used noninformative prior is
    π(θ, Σ) ∝ |Σ|^{-(p+1)/2}

19 MCMC for Multivariate Normal Model
To summarize, suppose that our model is
    y_i | θ, Σ ~ MVN(θ, Σ)
    θ ~ MVN(µ₀, Λ₀)
    Σ ~ Inverse Wishart(ν₀, S₀⁻¹)
We can draw a random sample from our posterior distribution using Gibbs sampling:
    θ | y₁,...,y_n, Σ ~ MVN( (Λ₀⁻¹ + nΣ⁻¹)⁻¹(Λ₀⁻¹µ₀ + nΣ⁻¹ȳ), (Λ₀⁻¹ + nΣ⁻¹)⁻¹ )
    Σ | y₁,...,y_n, θ ~ Inverse Wishart(ν₀ + n, [S₀ + S_θ]⁻¹)
The R package MCMCpack has a random inverse Wishart function, riwish(v, S). It seems to be built for Gibbs sampling: it generates only one random draw at a time.
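A Python analogue of this two-step Gibbs sampler can be sketched with scipy.stats.invwishart in place of R's riwish; all hyperparameters and data below are illustrative. Note that scipy parameterizes the inverse Wishart by the scale matrix itself, so the full conditional IW(ν₀ + n, [S₀ + S_θ]⁻¹) in the slides' notation corresponds to passing S₀ + S_θ as the scale:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(4)
p, n = 2, 40
y = rng.multivariate_normal([1.0, -1.0], [[1.0, 0.3], [0.3, 2.0]], size=n)
ybar = y.mean(axis=0)

# Illustrative hyperparameters
mu0, Lambda0 = np.zeros(p), 10.0 * np.eye(p)
nu0, S0 = p + 2, np.eye(p)

theta, Sigma = ybar.copy(), np.cov(y.T)        # starting values
theta_draws, Sigma_draws = [], []
for _ in range(500):
    # theta | Sigma, y: precision-weighted MVN full conditional
    A = np.linalg.inv(Lambda0) + n * np.linalg.inv(Sigma)
    b = np.linalg.inv(Lambda0) @ mu0 + n * np.linalg.inv(Sigma) @ ybar
    cov = np.linalg.inv(A)
    theta = rng.multivariate_normal(cov @ b, cov)
    # Sigma | theta, y: inverse Wishart with df nu0 + n, scale S0 + S_theta
    R = y - theta
    Sigma = invwishart.rvs(df=nu0 + n, scale=S0 + R.T @ R, random_state=rng)
    theta_draws.append(theta)
    Sigma_draws.append(Sigma)

theta_draws = np.array(theta_draws)
Sigma_draws = np.array(Sigma_draws)
```

With a vague prior on θ, the posterior mean of the θ draws should sit close to ȳ.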

20 Classical Approach
From a classical approach, we choose an estimator for β that minimizes ∑_{i=1}^n (y_i - βᵀx_i)². We will call β̂_OLS the ordinary least squares estimate of β.

21 Bayesian Approach: Semi-Conjugate Prior Distribution for β
Suppose that we choose a prior distribution for β that is MVN(β₀, Σ₀). Then
    π(β | y, X, σ²) ∝ p(y | X, β, σ²)π(β)
                    ∝ exp( -(1/(2σ²))(yᵀy - 2βᵀXᵀy + βᵀXᵀXβ) ) exp( -½ βᵀΣ₀⁻¹β + βᵀΣ₀⁻¹β₀ )
                    ∝ exp( -½ βᵀ(XᵀX/σ²)β + βᵀXᵀy/σ² ) exp( -½ βᵀΣ₀⁻¹β + βᵀΣ₀⁻¹β₀ )
                    = exp( -½ βᵀ(XᵀX/σ² + Σ₀⁻¹)β + βᵀ(Xᵀy/σ² + Σ₀⁻¹β₀) )

22 Bayesian Approach: Semi-Conjugate Prior Distribution for β
The full conditional distribution for β, π(β | y, X, σ²), is
    MVN( (XᵀX/σ² + Σ₀⁻¹)⁻¹(Xᵀy/σ² + Σ₀⁻¹β₀), (XᵀX/σ² + Σ₀⁻¹)⁻¹ )

23 Bayesian Approach: Semi-Conjugate Prior Distribution for σ²
Suppose that we choose a prior distribution for σ² that is InverseGamma(ν₀/2, ν₀σ₀²/2). Then
    π(σ² | y, X, β) ∝ p(y | X, β, σ²)π(σ²)
                    ∝ (σ²)^{-n/2} exp( -(1/(2σ²))(yᵀy - 2βᵀXᵀy + βᵀXᵀXβ) ) (1/σ²)^{ν₀/2+1} exp( -ν₀σ₀²/(2σ²) )
                    ∝ (1/σ²)^{(n+ν₀)/2+1} exp( -(1/(2σ²))(ν₀σ₀² + (y - Xβ)ᵀ(y - Xβ)) ),
which is InverseGamma( (n + ν₀)/2, [ν₀σ₀² + (y - Xβ)ᵀ(y - Xβ)]/2 ).
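Putting the two full conditionals together gives a Gibbs sampler for (β, σ²). A sketch with simulated data and illustrative hyperparameters; the inverse gamma draw is taken as the reciprocal of a gamma variate:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Illustrative semi-conjugate hyperparameters
beta0, Sigma0 = np.zeros(p), 100.0 * np.eye(p)
nu0, sigma2_0 = 1.0, 1.0

sigma2 = 1.0
beta_draws, sigma2_draws = [], []
for _ in range(1000):
    # beta | y, X, sigma2: MVN with precision X'X/sigma2 + Sigma0^{-1}
    A = X.T @ X / sigma2 + np.linalg.inv(Sigma0)
    b = X.T @ y / sigma2 + np.linalg.inv(Sigma0) @ beta0
    cov = np.linalg.inv(A)
    beta = rng.multivariate_normal(cov @ b, cov)
    # sigma2 | y, X, beta: InverseGamma((n+nu0)/2, (nu0*sigma2_0 + SSR)/2),
    # sampled as 1 / Gamma(shape, scale = 1/rate)
    ssr = np.sum((y - X @ beta) ** 2)
    sigma2 = 1.0 / rng.gamma(shape=(n + nu0) / 2, scale=2.0 / (nu0 * sigma2_0 + ssr))
    beta_draws.append(beta)
    sigma2_draws.append(sigma2)

beta_draws = np.array(beta_draws)
sigma2_draws = np.array(sigma2_draws)
```

With this much data and a vague prior, the posterior means of the β draws land near the generating coefficients.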

24 Other Prior Distributions
One common (and convenient) noninformative prior distribution is uniform on (β, log(σ)), i.e.,
    π(β, σ²) ∝ 1/σ²
In this case, we can show that
    β | σ², y, X ~ MVN( (XᵀX)⁻¹Xᵀy, (XᵀX)⁻¹σ² )
    σ² | y, X ~ InverseGamma( (n - p)/2, (y - Xβ̂_OLS)ᵀ(y - Xβ̂_OLS)/2 )
It is straightforward to derive all of the full conditionals and marginals for this model, so there are many choices for calculating samples from the posterior distribution.

25 Other Prior Distributions When we discussed noninformative prior distributions, we learned that prior distributions can be selected to capture many different kinds of knowledge. If we do not have much knowledge about the values of the regression parameters, we may wish to choose a prior distribution that is somehow minimally informative. A. Wilson (NCSU STAT) Linear Models November 4, / 28

26 Other Prior Distributions: Unit Information Prior
One suggestion for the normal linear regression model comes from Kass and Wasserman (1995). The idea is to create a prior distribution that contains the same amount of information as one observation. Recall that the variance of β̂_OLS is σ²(XᵀX)⁻¹. If we think about the precision (inverse variance) of this estimate, we have (XᵀX)/σ². The idea is that the precision of β̂_OLS is the amount of information in n observations. This implies that the amount of information in one observation is one nth as much, or (XᵀX)/(nσ²). Suppose that we set Σ₀ = nσ²(XᵀX)⁻¹ in the prior distribution for β. They also suggest setting β₀ = β̂_OLS, ν₀ = 1, and σ₀² = σ̂²_OLS.

27 Other Prior Distributions: g-prior
The g-prior starts from the idea that parameter estimation should be invariant to the scale of the regressors. Suppose that X is a given set of regressors and X̃ = XH for some invertible p × p matrix H. If we obtain the posterior distribution of β from y and X and of β̃ from y and X̃, then, according to this principle of invariance, the posterior distributions of β and Hβ̃ should be the same. This invariance holds if we choose
    π(β | σ²) = MVN(0, k(XᵀX)⁻¹)
    π(σ²) = InverseGamma(ν₀/2, ν₀σ₀²/2)
A popular choice for k is gσ². Choosing g large reflects less informative prior beliefs.

28 Other Prior Distributions: g-prior
In this case, we can show that
    β | σ², y, X ~ MVN( (g/(g+1)) (XᵀX)⁻¹Xᵀy, (g/(g+1)) (XᵀX)⁻¹σ² )
    σ² | y, X ~ InverseGamma( (n + ν₀)/2, (ν₀σ₀² + SSR_g)/2 )
where SSR_g = yᵀ(I - (g/(g+1)) X(XᵀX)⁻¹Xᵀ)y. Another popular noninformative choice is ν₀ = 0.
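The g-prior posterior mean is the OLS estimate shrunk toward zero by the factor g/(g+1), and SSR_g exceeds the OLS residual sum of squares by yᵀHy/(g+1), where H is the hat matrix. A quick numeric sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, g = 50, 3, 100.0
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.0, -2.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_ols = XtX_inv @ X.T @ y
shrink = g / (g + 1.0)

post_mean = shrink * beta_ols                  # posterior mean of beta under the g-prior
H = X @ XtX_inv @ X.T                          # hat matrix
SSR_g = y @ (np.eye(n) - shrink * H) @ y       # as defined on the slide

resid = y - X @ beta_ols
ssr_ols = resid @ resid                        # ordinary OLS residual sum of squares
```

As g grows, the shrinkage factor approaches 1 and the posterior mean approaches β̂_OLS.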


More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Modeling Real Estate Data using Quantile Regression

Modeling Real Estate Data using Quantile Regression Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices

More information

Linear Regression and Discrimination

Linear Regression and Discrimination Linear Regression and Discrimination Kernel-based Learning Methods Christian Igel Institut für Neuroinformatik Ruhr-Universität Bochum, Germany http://www.neuroinformatik.rub.de July 16, 2009 Christian

More information

AMS-207: Bayesian Statistics

AMS-207: Bayesian Statistics Linear Regression How does a quantity y, vary as a function of another quantity, or vector of quantities x? We are interested in p(y θ, x) under a model in which n observations (x i, y i ) are exchangeable.

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

MEI Exam Review. June 7, 2002

MEI Exam Review. June 7, 2002 MEI Exam Review June 7, 2002 1 Final Exam Revision Notes 1.1 Random Rules and Formulas Linear transformations of random variables. f y (Y ) = f x (X) dx. dg Inverse Proof. (AB)(AB) 1 = I. (B 1 A 1 )(AB)(AB)

More information

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise

More information

Bayesian Linear Models

Bayesian Linear Models Bayesian Linear Models Sudipto Banerjee September 03 05, 2017 Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles Linear Regression Linear regression is,

More information

Chapter 17: Undirected Graphical Models

Chapter 17: Undirected Graphical Models Chapter 17: Undirected Graphical Models The Elements of Statistical Learning Biaobin Jiang Department of Biological Sciences Purdue University bjiang@purdue.edu October 30, 2014 Biaobin Jiang (Purdue)

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

1 Priors, Contd. ISyE8843A, Brani Vidakovic Handout Nuisance Parameters

1 Priors, Contd. ISyE8843A, Brani Vidakovic Handout Nuisance Parameters ISyE8843A, Brani Vidakovic Handout 6 Priors, Contd.. Nuisance Parameters Assume that unknown parameter is θ, θ ) but we are interested only in θ. The parameter θ is called nuisance parameter. How do we

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

The Normal Linear Regression Model with Natural Conjugate Prior. March 7, 2016

The Normal Linear Regression Model with Natural Conjugate Prior. March 7, 2016 The Normal Linear Regression Model with Natural Conjugate Prior March 7, 2016 The Normal Linear Regression Model with Natural Conjugate Prior The plan Estimate simple regression model using Bayesian methods

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Metropolis-Hastings Algorithm

Metropolis-Hastings Algorithm Strength of the Gibbs sampler Metropolis-Hastings Algorithm Easy algorithm to think about. Exploits the factorization properties of the joint probability distribution. No difficult choices to be made to

More information

The Relationship Between the Power Prior and Hierarchical Models

The Relationship Between the Power Prior and Hierarchical Models Bayesian Analysis 006, Number 3, pp. 55 574 The Relationship Between the Power Prior and Hierarchical Models Ming-Hui Chen, and Joseph G. Ibrahim Abstract. The power prior has emerged as a useful informative

More information

Gaussian Models (9/9/13)

Gaussian Models (9/9/13) STA561: Probabilistic machine learning Gaussian Models (9/9/13) Lecturer: Barbara Engelhardt Scribes: Xi He, Jiangwei Pan, Ali Razeen, Animesh Srivastava 1 Multivariate Normal Distribution The multivariate

More information

Introduction to Gaussian Processes

Introduction to Gaussian Processes Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing

More information

Bayesian Inference for Regression Parameters

Bayesian Inference for Regression Parameters Bayesian Inference for Regression Parameters 1 Bayesian inference for simple linear regression parameters follows the usual pattern for all Bayesian analyses: 1. Form a prior distribution over all unknown

More information

A tutorial on. Richard Boys

A tutorial on. Richard Boys A tutorial on Bayesian inference for the normal linear model Richard Boys Statistics Research Group, Newcastle University, UK Motivation Much of Bayesian inference nowadays analyses complex hierarchical

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Multivariate spatial models and the multikrig class

Multivariate spatial models and the multikrig class Multivariate spatial models and the multikrig class Stephan R Sain, IMAGe, NCAR ENAR Spring Meetings March 15, 2009 Outline Overview of multivariate spatial regression models Case study: pedotransfer functions

More information

Outline. Motivation Contest Sample. Estimator. Loss. Standard Error. Prior Pseudo-Data. Bayesian Estimator. Estimators. John Dodson.

Outline. Motivation Contest Sample. Estimator. Loss. Standard Error. Prior Pseudo-Data. Bayesian Estimator. Estimators. John Dodson. s s Practitioner Course: Portfolio Optimization September 24, 2008 s The Goal of s The goal of estimation is to assign numerical values to the parameters of a probability model. Considerations There are

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y

More information

Advanced Statistical Modelling

Advanced Statistical Modelling Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression Reading: Hoff Chapter 9 November 4, 2009 Problem Data: Observe pairs (Y i,x i ),i = 1,... n Response or dependent variable Y Predictor or independent variable X GOALS: Exploring

More information

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking

More information

Lecture 3. Univariate Bayesian inference: conjugate analysis

Lecture 3. Univariate Bayesian inference: conjugate analysis Summary Lecture 3. Univariate Bayesian inference: conjugate analysis 1. Posterior predictive distributions 2. Conjugate analysis for proportions 3. Posterior predictions for proportions 4. Conjugate analysis

More information

Weighted Least Squares

Weighted Least Squares Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w

More information