Part 4: Multi-parameter and normal models

1 Part 4: Multi-parameter and normal models

2 The normal model  Perhaps the most useful (or most utilized) probability model for data analysis is the normal distribution. There are several reasons for this, e.g., the CLT, and the fact that it is a simple model with separate parameters for the population mean and variance, two quantities that are often of primary interest.

3 The normal PDF  A RV Y is said to be normally distributed with mean µ and variance σ² if the density of Y is given by

f(y | µ, σ²) = (1/√(2πσ²)) exp{ −(y − µ)²/(2σ²) },  for −∞ < y < ∞

We shall write Y ~ N(µ, σ²) and call (µ, σ²) the pair of parameters.

4 Normal properties  Some important things to remember about the normal distribution include: the distribution is symmetric about µ, so the mode, the median and the mean are all equal to µ; about 95% of the population lies within two (more precisely 1.96) standard deviations of the mean; and if X ~ N(µ_x, σ_x²) and Y ~ N(µ_y, σ_y²) with X and Y independent, then aX + bY ~ N(aµ_x + bµ_y, a²σ_x² + b²σ_y²).

5 Joint sampling density  Suppose our sampling model is {Y₁,…,Yₙ | µ, σ²} ~ iid N(µ, σ²). Then the joint sampling density is given by

p(y₁,…,yₙ | µ, σ²) = ∏ᵢ₌₁ⁿ p(yᵢ | µ, σ²) = ∏ᵢ₌₁ⁿ (1/√(2πσ²)) exp{ −(yᵢ − µ)²/(2σ²) } = (2πσ²)^(−n/2) exp{ −(1/(2σ²)) Σᵢ₌₁ⁿ (yᵢ − µ)² }

6 Joint sampling density  By expanding the quadratic term in the exponent,

Σᵢ₌₁ⁿ (yᵢ − µ)² = Σ yᵢ² − 2µ Σ yᵢ + nµ²

we see that p(y₁,…,yₙ | µ, σ²) depends upon y₁,…,yₙ only through {Σ yᵢ, Σ yᵢ²}, a two-dimensional sufficient statistic. Knowing these quantities is equivalent to knowing

ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ  and  s² = (1/(n−1)) Σᵢ₌₁ⁿ (yᵢ − ȳ)²

and so {ȳ, s²} is also a sufficient statistic.
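
A quick numerical sketch of this equivalence, assuming NumPy and an arbitrary illustrative data vector:

```python
import numpy as np

y = np.array([2.1, 1.9, 2.4, 2.2, 1.8])     # illustrative data; any sample would do
n = y.size
sum_y, sum_y2 = y.sum(), np.sum(y**2)       # the two-dimensional sufficient statistic
ybar = sum_y / n
s2 = (sum_y2 - n * ybar**2) / (n - 1)       # recovers the usual sample variance
assert np.isclose(s2, y.var(ddof=1))
print(ybar, s2)
```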

7 Inference by conditioning  Inference for this two-parameter model can be broken down into two one-parameter problems. We begin with the problem of making inference for µ when σ² is known, and use a conjugate prior for µ. For any (conditional) prior distribution p(µ | σ²), the posterior distribution will satisfy

p(µ | y₁,…,yₙ, σ²) ∝ exp{ −(1/(2σ²)) Σᵢ₌₁ⁿ (yᵢ − µ)² } p(µ | σ²) ∝ e^{−c₁(µ − c₂)²} p(µ | σ²)

8 Conjugate prior  So we see that if p(µ | σ²) is to be conjugate, then it must include quadratic terms like exp{−c₁(µ − c₂)²}. The simplest such class of probability densities is the normal family, suggesting that if p(µ | σ²) is normal and {y₁,…,yₙ | µ, σ²} ~ iid N(µ, σ²), then p(µ | y₁,…,yₙ, σ²) is also a normal density.

9 Posterior derivation  If µ | σ² ~ N(µ₀, τ₀²) then

p(µ | y₁,…,yₙ, σ²) ∝ exp{ −(µ − b/a)²/(2/a) },  where  a = 1/τ₀² + n/σ²  and  b = µ₀/τ₀² + Σ yᵢ/σ²

This function has exactly the same form as a normal density curve, with 1/a playing the role of the variance and b/a playing the role of the mean.

10 Normal posterior  Since probability distributions are determined by their shape, this means that p(µ | y₁,…,yₙ, σ²) is indeed a normal density, i.e., {µ | y₁,…,yₙ, σ²} ~ N(µₙ, τₙ²), where

τₙ² = 1/a = 1/(1/τ₀² + n/σ²)  and  µₙ = b/a = (µ₀/τ₀² + nȳ/σ²) / (1/τ₀² + n/σ²)

So: normal prior + normal sampling model = normal posterior.
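
A minimal sketch of this update in code, assuming NumPy (the function name and argument names are illustrative):

```python
import numpy as np

def posterior_mu_given_sigma2(y, mu0, tau2_0, sigma2):
    """Conditional posterior N(mu_n, tau2_n) for mu, treating sigma^2 as known."""
    y = np.asarray(y, dtype=float)
    n = y.size
    precision_n = 1.0 / tau2_0 + n / sigma2                  # posterior precision
    tau2_n = 1.0 / precision_n                               # posterior variance
    mu_n = tau2_n * (mu0 / tau2_0 + n * y.mean() / sigma2)   # posterior mean
    return mu_n, tau2_n
```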

11 Combining information  The (conditional) posterior parameters τₙ² and µₙ combine the prior parameters τ₀² and µ₀ with terms from the data. First consider the posterior inverse variance, a.k.a. the precision:

1/τₙ² = 1/τ₀² + n/σ²

So the posterior precision combines the prior precision and the sampling precision, and 1/τₙ² → ∞ (i.e., τₙ² → 0) as n → ∞.

12 Combining means  Notice that the posterior mean decomposes as

µₙ = [ (1/τ₀²)/(1/τ₀² + n/σ²) ] µ₀ + [ (n/σ²)/(1/τ₀² + n/σ²) ] ȳ

so it is a weighted average of the prior mean and the data mean. The weight on the prior mean is 1/τ₀², the prior precision, and the weight on the data mean is n/σ², the sampling precision.

13 Prior sample size  If the prior mean were based on κ₀ prior observations from the same (or similar) population as Y₁,…,Yₙ, then we might want to set τ₀² = σ²/κ₀, which may be interpreted as the variance of the mean of the prior observations. In this case the formula for the posterior mean reduces to

µₙ = [κ₀/(κ₀ + n)] µ₀ + [n/(κ₀ + n)] ȳ

Finally, µₙ → ȳ as n → ∞.

14 Prediction  Consider predicting a new observation Ỹ from the population after having observed {Y₁ = y₁,…,Yₙ = yₙ}. To find the predictive distribution we may take advantage of the following fact:

{Ỹ | µ, σ²} ~ N(µ, σ²)  ⟺  Ỹ = µ + ε,  ε ~ N(0, σ²)

In other words, saying that Ỹ is normal with mean µ is the same as saying that Ỹ is equal to µ plus some mean-zero normally distributed noise.

15 Predictive moments  Using this result, let's first compute the posterior mean and variance of Ỹ:

E{Ỹ | y₁,…,yₙ, σ²} = E{µ + ε | y₁,…,yₙ, σ²} = E{µ | y₁,…,yₙ, σ²} + E{ε | y₁,…,yₙ, σ²} = µₙ + 0 = µₙ

Var[Ỹ | y₁,…,yₙ, σ²] = Var[µ + ε | y₁,…,yₙ, σ²] = Var[µ | y₁,…,yₙ, σ²] + Var[ε | y₁,…,yₙ, σ²] = τₙ² + σ²

16 Moments to distribution  Recall that the sum of independent normal random variables is normal. Therefore, since µ and ε, conditional on y₁,…,yₙ and σ², are normally distributed, so is Ỹ = µ + ε. So the predictive distribution is

Ỹ | y₁,…,yₙ, σ² ~ N(µₙ, τₙ² + σ²)

Observe that, as n → ∞, Var[Ỹ | y₁,…,yₙ, σ²] → σ² > 0, i.e., certainty about µ does not translate into certainty about Ỹ.
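
A small sketch of simulating from this predictive distribution, assuming NumPy; the posterior values below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_n, tau2_n, sigma2 = 1.805, 0.002, 0.017     # hypothetical posterior values

# Y~ = mu + eps: draw mu from its posterior, then add independent N(0, sigma^2) noise
mu = rng.normal(mu_n, np.sqrt(tau2_n), size=100_000)
y_new = mu + rng.normal(0.0, np.sqrt(sigma2), size=100_000)
print(y_new.mean(), y_new.var())               # approx mu_n and tau2_n + sigma2
```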

17 Example: Midge wing length  Grogan and Wirth (1981) provide data on the wing length in millimeters of nine members of a species of midge (small, two-winged flies). From these measurements we wish to make inference about the population mean µ.

18 Example: prior information  Studies from other populations suggest that wing lengths are usually around 1.9mm, so we set µ₀ = 1.9. We also know that lengths must be positive (µ > 0). We can approximate this restriction with a normal prior distribution for µ as follows: since most of the normal density is within two standard deviations of the mean, we choose τ₀ so that

µ₀ − 2τ₀ > 0  ⟹  τ₀ < 1.9/2 = 0.95

For now we take τ₀ = 0.95, which somewhat overstates our prior uncertainty.

19 Example: data and posterior  The observations, in order of increasing magnitude, are

(1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08)

Therefore the posterior is {µ | y₁,…,y₉, σ²} ~ N(µₙ, τₙ²), where

µₙ = (µ₀/τ₀² + nȳ/σ²) / (1/τ₀² + n/σ²)  and  τₙ² = 1/(1/τ₀² + n/σ²)

20 Example: a choice of σ²  If we take σ² = s² = 0.017, then

{µ | y₁,…,y₉, σ² = 0.017} ~ N(1.805, 0.002)

[Figure: prior and posterior densities for µ.]

A 95% CI for µ is (1.72, 1.89).
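
A numerical check of these values, assuming NumPy/SciPy and the midge data listed on the previous slide:

```python
import numpy as np
from scipy.stats import norm

y = np.array([1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08])   # midge data
n, ybar = y.size, y.mean()
mu0, tau2_0 = 1.9, 0.95**2          # prior mean and variance for mu
sigma2 = y.var(ddof=1)              # plug in s^2 (about 0.017) as the "known" variance

tau2_n = 1.0 / (1.0 / tau2_0 + n / sigma2)
mu_n = tau2_n * (mu0 / tau2_0 + n * ybar / sigma2)
ci = norm.ppf([0.025, 0.975], loc=mu_n, scale=np.sqrt(tau2_n))
print(round(mu_n, 3), round(tau2_n, 3), np.round(ci, 2))   # about 1.805, 0.002, [1.72 1.89]
```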

21 Example: full accounting of uncertainty  However, these results assume that we are certain that σ² = s², when in fact s² is only a rough estimate of σ² based upon only nine observations. To get a more accurate representation of our information (and in particular of our posterior uncertainty) we need to account for the fact that σ² is not known.

22 Joint Bayesian inference  Bayesian inference for two or more parameters is not conceptually different from the one-parameter case. For any joint prior distribution p(µ, σ²) for µ and σ², posterior inference proceeds using Bayes' rule:

p(µ, σ² | y₁,…,yₙ) = p(y₁,…,yₙ | µ, σ²) p(µ, σ²) / p(y₁,…,yₙ)

As before, we will begin by developing a simple conjugate class of prior distributions which will make posterior calculations easy.

23 Prior decomposition  A joint distribution can always be expressed as the product of a conditional probability and a marginal probability:

p(µ, σ²) = p(µ | σ²) p(σ²)

We just saw that if σ² were known, then a conjugate prior distribution for µ was N(µ₀, τ₀²).

24 Prior decomposition  Let's consider the particular case where τ₀² = σ²/κ₀:

p(µ, σ²) = p(µ | σ²) p(σ²) = N(µ; µ₀, τ₀² = σ²/κ₀) p(σ²)

The parameters µ₀ and κ₀ can be interpreted as the mean and sample size from a set of prior observations. For σ² we need a family of prior distributions that has support on (0, ∞).

25 Conjugate prior for σ²  One such family of distributions is the gamma family, as we used for the Poisson sampling model. Unfortunately, this family is not conjugate for the normal variance. However, the gamma family does turn out to be a conjugate class of densities for the precision 1/σ²:

1/σ² ~ G(a, b)  ⟹  σ² ~ IG(a, b)

When using such a prior we say that σ² has an inverse-gamma distribution.

26 Inverse-gamma prior  For interpretability, instead of using a and b we will use σ² ~ IG(ν₀/2, ν₀σ₀²/2). Under this parameterization:

E{σ²} = σ₀² (ν₀/2)/(ν₀/2 − 1)
mode(σ²) = σ₀² (ν₀/2)/(ν₀/2 + 1),  so  mode(σ²) < σ₀² < E{σ²}
Var[σ²] is decreasing in ν₀

We can interpret the prior parameters (σ₀², ν₀) as the prior sample variance and the prior sample size.
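
A quick check of this parameterization with SciPy's inverse-gamma distribution (illustrative hyperparameter values; note that the prior mean exists only when ν₀ > 2):

```python
from scipy.stats import invgamma

nu0, s2_0 = 10, 0.015                              # illustrative hyperparameters
prior = invgamma(a=nu0 / 2, scale=nu0 * s2_0 / 2)  # sigma^2 ~ IG(nu0/2, nu0*s2_0/2)

print(prior.mean())                                # s2_0*(nu0/2)/(nu0/2 - 1) = 0.01875
print((nu0 * s2_0 / 2) / (nu0 / 2 + 1))            # mode: s2_0*(nu0/2)/(nu0/2 + 1) = 0.0125
```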

27 Posterior inference  Suppose our prior model and sampling model are

σ² ~ IG(ν₀/2, ν₀σ₀²/2)
µ | σ² ~ N(µ₀, σ²/κ₀)
Y₁,…,Yₙ | µ, σ² ~ iid N(µ, σ²)

Just as the prior distribution for µ and σ² can be decomposed as p(µ, σ²) = p(µ | σ²) p(σ²), the posterior distribution can be similarly decomposed as

p(µ, σ² | y₁,…,yₙ) = p(µ | σ², y₁,…,yₙ) p(σ² | y₁,…,yₙ)

28 Posterior conditional(s)  We have already derived the posterior conditional distribution of µ given σ². Plugging in τ₀² = σ²/κ₀ we have {µ | y₁,…,yₙ, σ²} ~ N(µₙ, σ²/κₙ), where

κₙ = κ₀ + n  and  µₙ = [(κ₀/σ²)µ₀ + (n/σ²)ȳ] / (κ₀/σ² + n/σ²) = (κ₀µ₀ + nȳ)/κₙ

(the posterior sample size and mean).

29 Posterior conditional(s)  The posterior distribution of σ² can be obtained by integrating over the unknown value of µ:

p(σ² | y₁,…,yₙ) ∝ p(y₁,…,yₙ | σ²) p(σ²) = p(σ²) ∫ p(y₁,…,yₙ | µ, σ²) p(µ | σ²) dµ

This integral can be done without much knowledge of calculus, but it is somewhat tedious.

30 Posterior conditional(s)  Check that this leads to {σ² | y₁,…,yₙ} ~ IG(νₙ/2, νₙσₙ²/2), where

νₙ = ν₀ + n
σₙ² = (1/νₙ) [ ν₀σ₀² + (n−1)s² + (κ₀n/κₙ)(ȳ − µ₀)² ]

These formulae suggest that the interpretation of ν₀ as a prior sample size, from which a prior sample variance σ₀² has been obtained, is reasonable. Similarly, we may think of ν₀σ₀² and νₙσₙ² as prior and posterior sums of squares, the latter combining the prior and data sums of squares.
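
A minimal sketch implementing these update formulas, assuming NumPy (function and argument names are illustrative):

```python
import numpy as np

def normal_conjugate_posterior(y, mu0, kappa0, nu0, s2_0):
    """Posterior hyperparameters (mu_n, kappa_n, nu_n, s2_n) under the conjugate prior
    mu | sigma^2 ~ N(mu0, sigma^2/kappa0) and sigma^2 ~ IG(nu0/2, nu0*s2_0/2)."""
    y = np.asarray(y, dtype=float)
    n, ybar, s2 = y.size, y.mean(), y.var(ddof=1)
    kappa_n, nu_n = kappa0 + n, nu0 + n
    mu_n = (kappa0 * mu0 + n * ybar) / kappa_n
    s2_n = (nu0 * s2_0 + (n - 1) * s2 + kappa0 * n * (ybar - mu0)**2 / kappa_n) / nu_n
    return mu_n, kappa_n, nu_n, s2_n

# With the midge data and mu0=1.9, kappa0=nu0=1, s2_0=0.01, this returns
# approximately (1.814, 10, 10, 0.015), matching the example below.
```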

31 Posterior conditional(s)

σₙ² = (1/νₙ) [ ν₀σ₀² + (n−1)s² + (κ₀n/κₙ)(ȳ − µ₀)² ]

However, the third term is a bit harder to understand. It says that a large value of (ȳ − µ₀)² increases the posterior probability of a large σ². This makes sense for our joint prior p(µ, σ²): if we want to think of µ₀ as the sample mean of κ₀ prior observations with variance σ², then this term is also an estimate of σ².

32 Example: the midge data  The studies of other populations suggest that the true mean and standard deviation of our population under study should not be far from 1.9mm and 0.1mm respectively, suggesting µ₀ = 1.9 and σ₀² = 0.01. However, this population may be different from others in terms of wing length, and so we choose κ₀ = ν₀ = 1, so that our prior distributions are only weakly centered around the estimates from other populations.

33 Example: the midge data  The sample mean and variance of our observed data are ȳ = 1.804 and s² = 0.017. From these values and the prior parameters, we compute

µₙ = (κ₀µ₀ + nȳ)/κₙ = (1 × 1.9 + 9 × 1.804)/10 = 1.814
σₙ² = (1/νₙ) [ ν₀σ₀² + (n−1)s² + (κ₀n/κₙ)(ȳ − µ₀)² ] = (0.010 + 0.135 + 0.008)/10 = 0.015

34 Example: the full posterior  Our joint posterior distribution is completely determined by these values, µₙ = 1.814, κₙ = 10, σₙ² = 0.015, νₙ = 10, and can be expressed as

{µ | y₁,…,yₙ, σ²} ~ N(1.814, σ²/10)
{σ² | y₁,…,yₙ} ~ IG(10/2, 10 × 0.015/2)

35 Example: the full posterior

[Figure: contours of the joint posterior density p(µ, σ² | y₁,…,y₉).]

Observe that the contours are more peaked as a function of µ for low values of σ² than for high values.

36 Marginal interests  For many data analyses, interest primarily lies in estimating the population mean µ, and so we would like to calculate quantities like

E{µ | y₁,…,yₙ},  Var[µ | y₁,…,yₙ],  Pr(µ₁ < µ < µ₂ | y₁,…,yₙ)

These quantities are all determined by the marginal posterior distribution of µ given the data. But all we know so far is that the conditional distribution of µ given the data and σ² is normal, and that σ² given the data is inverse-gamma.

37 Joint sampling  If we could generate marginal samples of µ, then we could use the MC method to approximate the marginal quantities of interest. It turns out that this is easy to do by generating samples of µ and σ² from their joint posterior by MC:

σ²(1) ~ IG(νₙ/2, νₙσₙ²/2),  µ(1) ~ N(µₙ, σ²(1)/κₙ)
σ²(2) ~ IG(νₙ/2, νₙσₙ²/2),  µ(2) ~ N(µₙ, σ²(2)/κₙ)
⋮
σ²(S) ~ IG(νₙ/2, νₙσₙ²/2),  µ(S) ~ N(µₙ, σ²(S)/κₙ)
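
A minimal Monte Carlo sketch of this scheme, assuming NumPy and using the midge posterior parameters from above:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_n, kappa_n, nu_n, s2_n = 1.814, 10, 10, 0.015   # midge posterior parameters
S = 10_000

# sigma^2 ~ IG(nu_n/2, nu_n*s2_n/2), drawn as the reciprocal of a gamma variate
sigma2 = 1.0 / rng.gamma(shape=nu_n / 2, scale=2.0 / (nu_n * s2_n), size=S)
# mu | sigma^2, data ~ N(mu_n, sigma^2/kappa_n), one draw per sigma^2 sample
mu = rng.normal(loc=mu_n, scale=np.sqrt(sigma2 / kappa_n))

print(mu.mean(), np.quantile(mu, [0.025, 0.975]))   # MC marginal summaries for mu
```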

38 Monte Carlo marginals  The sequence of pairs {(σ²(1), µ(1)),…,(σ²(S), µ(S))} simulated with this procedure are independent samples from the joint posterior p(µ, σ² | y₁,…,yₙ). Additionally, the simulated sequence {µ(1),…,µ(S)} can be seen as independent samples from the marginal posterior distribution p(µ | y₁,…,yₙ). So these may be used to make MC approximations to posterior expectations of (functions of) µ.

39 Example: joint & marginal samples  The samples µ(s) are obtained conditional on the samples of σ²(s), leading to joint samples (µ(s), σ²(s)). We may extract the marginals from the joint.

40 Example: credible v. confidence intervals

[Figure: marginal posterior density of µ with the 95% credible and confidence intervals marked.]

The Bayesian CI is very close to the frequentist CI based on t-statistics, since

p(µ | y₁,…,yₙ) is St_{ν₀+n}(µₙ, σₙ²/κₙ)

If κ₀ and ν₀ are small, this will be very close to the sampling distribution of the MLE.

41 Objective priors?  How small can κ₀ and ν₀, the prior sample sizes, be? Consider

µₙ = (κ₀µ₀ + nȳ)/κₙ
σₙ² = (1/νₙ) [ ν₀σ₀² + (n−1)s² + (κ₀n/κₙ)(ȳ − µ₀)² ]

So as κ₀, ν₀ → 0:

µₙ → ȳ  and  σₙ² → [(n−1)/n] s² = (1/n) Σ (yᵢ − ȳ)²

42 Improper prior  This leads to the following "posterior":

{σ² | y₁,…,yₙ} ~ IG( n/2, (1/2) Σ (yᵢ − ȳ)² )
{µ | y₁,…,yₙ, σ²} ~ N( ȳ, σ²/n )

More formally, if we take the improper prior p(µ, σ²) ∝ 1/σ², the posterior is proper, and we get the same posterior conditional for µ as above, but

{σ² | y₁,…,yₙ} ~ IG( (n−1)/2, (1/2) Σ (yᵢ − ȳ)² )

43 Classical comparison  The marginal posterior for µ may then be obtained as

p(µ | y₁,…,yₙ) = ∫ p(µ | σ², y₁,…,yₙ) p(σ² | y₁,…,yₙ) dσ²   (cond. prob., LTP)
             ∝ St_{n−1}(µ; ȳ, s²/n)   (normal conditional, IG marginal)

It is interesting to compare this result to the sampling distribution of the t-statistic, i.e., of the MLE:

t = (Ȳ − µ)/(s/√n) | µ ~ t_{n−1},  so  ˆµ = Ȳ ~ St_{n−1}(µ, s²/n)
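
A short sketch of the coincidence on the midge data, assuming SciPy's t distribution: under the improper prior, the 95% posterior credible interval for µ is numerically the classical 95% t confidence interval.

```python
import numpy as np
from scipy.stats import t

y = np.array([1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08])   # midge data
n, ybar, s = y.size, y.mean(), y.std(ddof=1)

# marginal posterior of mu under p(mu, sigma^2) proportional to 1/sigma^2: St_{n-1}(ybar, s^2/n)
ci_bayes = t.ppf([0.025, 0.975], df=n - 1, loc=ybar, scale=s / np.sqrt(n))
print(np.round(ci_bayes, 3))   # identical to the classical 95% t confidence interval
```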

44 Point estimator  A point estimator of an unknown parameter is a function that converts your data into a single element of the parameter space. In the case of the normal sampling model and conjugate prior distribution, the posterior mean estimator of µ is

ˆµ_b(y₁,…,yₙ) = E{µ | y₁,…,yₙ} = [n/(κ₀ + n)] ȳ + [κ₀/(κ₀ + n)] µ₀ = w ȳ + (1 − w) µ₀

45 Sampling properties  The sampling properties of an estimator such as ˆµ_b refer to its behavior under hypothetically repeatable surveys or experiments. Let's compare the sampling properties of ˆµ_b to those of ˆµ_e(y₁,…,yₙ) = ȳ, the sample mean, when the true population mean is µ_true:

E{ˆµ_e | µ = µ_true} = µ_true
E{ˆµ_b | µ = µ_true} = w µ_true + (1 − w) µ₀

So we say that ˆµ_e is unbiased and, unless µ₀ = µ_true, we say that ˆµ_b is biased.

46 Bias  Bias refers to how close the center of mass of the sampling distribution of the estimator is to the true value. An unbiased estimator is an estimator with zero bias, which sounds desirable. However, bias does not tell us how far away an estimate might be from the true value. For example, ˆµ = y₁ is an unbiased estimator of the population mean µ_true, but will generally be farther away from µ_true than ȳ.

47 Mean squared error  To evaluate how close an estimator ˆθ is likely to be to the true value θ_true, we might use the mean squared error (MSE). Letting m = E{ˆθ | θ = θ_true}, the MSE is

MSE[ˆθ | θ_true] = E{(ˆθ − θ_true)² | θ = θ_true}
               = E{(ˆθ − m)² | θ = θ_true} + (m − θ_true)²
               = Var[ˆθ | θ = θ_true] + Bias²[ˆθ | θ = θ_true]
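
A quick simulation confirming the decomposition MSE = variance + bias² for the shrinkage estimator ˆµ_b, assuming NumPy; the population and prior values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma, n = 1.83, 0.13, 9          # illustrative "true" population values
mu0, kappa0 = 1.9, 1                       # illustrative prior settings
w = n / (kappa0 + n)

ybar = rng.normal(mu_true, sigma / np.sqrt(n), size=200_000)   # sampling dist. of the mean
mu_b = w * ybar + (1 - w) * mu0                                # Bayes point estimates
mse = np.mean((mu_b - mu_true)**2)
var_plus_bias2 = mu_b.var() + (mu_b.mean() - mu_true)**2
print(mse, var_plus_bias2)                 # agree up to Monte Carlo error
```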

48 Mean squared error  This means that, before the data are gathered, the expected squared distance from the estimator to the true value depends on how close the center of the sampling distribution of ˆθ is to θ_true (the bias), and on how spread out the distribution is (the variance).

49 Comparing estimators  Getting back to our comparison of ˆµ_b to ˆµ_e: the bias of ˆµ_e is zero, but

Var[ˆµ_e | µ = µ_true, σ²] = σ²/n,  whereas  Var[ˆµ_b | µ = µ_true, σ²] = w² σ²/n < σ²/n

and so ˆµ_b has lower variability.

50 Comparing estimators  Which one is better in terms of MSE?

MSE[ˆµ_e | µ = µ_true] = Var[ˆµ_e | µ = µ_true] = σ²/n
MSE[ˆµ_b | µ = µ_true] = Var[ˆµ_b | µ = µ_true] + Bias²[ˆµ_b | µ = µ_true] = w² σ²/n + (1 − w)² (µ₀ − µ_true)²

Therefore MSE[ˆµ_b | µ = µ_true] < MSE[ˆµ_e | µ = µ_true] if

(µ₀ − µ_true)² < (σ²/n)(1 + w)/(1 − w) = σ² (1/n + 2/κ₀)
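
A direct closed-form check of this comparison (the truth and prior values plugged in below are illustrative):

```python
def mse_pair(mu_true, mu0, sigma2, n, kappa0):
    """Closed-form MSEs of the Bayes estimator and the sample mean."""
    w = n / (kappa0 + n)
    mse_bayes = w**2 * sigma2 / n + (1 - w)**2 * (mu0 - mu_true)**2
    mse_mle = sigma2 / n
    return mse_bayes, mse_mle

# Bayes wins whenever (mu0 - mu_true)^2 < sigma2 * (1/n + 2/kappa0)
print(mse_pair(mu_true=1.83, mu0=1.9, sigma2=0.017, n=9, kappa0=1))
```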

51 Low MSE Bayesian estimators  So if you know just a little about the population you are about to sample from, you should be able to find values of µ₀ and κ₀ such that the Bayesian estimator has lower average distance to the truth (MSE) than the MLE. E.g., if you are pretty sure that your prior guess µ₀ is within two standard deviations of the true population mean, then if you pick κ₀ = 1 you can be pretty sure that the Bayes estimator has a lower MSE, since then (µ₀ − µ_true)² < 4σ².

52 Example: IQ scores  Scoring on IQ tests is designed to produce a normal distribution with a mean of 100 and a variance of 225 when applied to the general population. Suppose that we were to sample n individuals from a particular town in the USA and then use the data to estimate µ, the town-specific mean IQ score. For a Bayesian estimation, if we lack much information about the town in question, a natural choice of prior mean would be µ₀ = 100.

53 Example: IQ scores, MSE  Suppose that, unknown to us, the people in this town are extremely exceptional, and the true mean and variance of IQ scores in the town are (µ_true = 112, σ² = 169). The MSEs of the estimators ˆµ_e and ˆµ_b are

MSE[ˆµ_e | µ = 112, σ² = 169] = Var[ˆµ_e | µ = 112, σ² = 169] = 169/n
MSE[ˆµ_b | µ = 112, σ² = 169] = Var[ˆµ_b | µ = 112, σ² = 169] + Bias²[ˆµ_b | µ = 112, σ² = 169] = w² (169/n) + (1 − w)² 144

54 Example: Relative MSE  One way to compare MSEs is through their ratio. Consider MSE[ˆµ_b | µ = 112] / MSE[ˆµ_e | µ = 112] plotted as a function of n, for κ₀ ∈ {1, 2, 3}.

[Figure: relative MSE versus sample size n, one curve for each of κ₀ = 0, 1, 2, 3.]

When κ₀ ∈ {1, 2}, the Bayes estimate has lower MSE than the MLE; the µ₀ = 100 setting is only bad for κ₀ ≥ 3; and as n increases, the bias of the estimators shrinks to zero.
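
A sketch reproducing these relative-MSE values numerically, assuming NumPy (κ₀ = 0 corresponds to the MLE itself, so its ratio is 1):

```python
import numpy as np

mu_true, mu0, sigma2 = 112.0, 100.0, 169.0
ns = np.arange(1, 51)

for kappa0 in [0, 1, 2, 3]:
    w = ns / (kappa0 + ns)
    mse_bayes = w**2 * sigma2 / ns + (1 - w)**2 * (mu0 - mu_true)**2
    mse_mle = sigma2 / ns
    ratio = mse_bayes / mse_mle                     # < 1 means the Bayes estimator wins
    print(kappa0, np.round(ratio[[0, 9, 49]], 2))   # ratios at n = 1, 10, 50
```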

55 Example: Sampling densities  Consider the sampling densities of the estimators when n = 10, which highlight the relative contributions of the bias and variance to the MSE.

[Figure: sampling densities of the estimators for κ₀ = 0, 1, 2, 3, with the true mean IQ marked.]
