MAS3301 Bayesian Statistics Problems 5 and Solutions


Semester 2, 2008-9

Problems 5

1. (Some of this question is also in Problems 4). I recorded the attendance of students at tutorials for a module. Suppose that we can, in some sense, regard the students as a sample from some population of students so that, for example, we can learn about the likely behaviour of next year's students by observing this year's. At the time I recorded the data we had had tutorials in Week 2 and Week 4. Let the probability that a student attends in both weeks be θ_11, the probability that a student attends in Week 2 but not Week 4 be θ_10 and so on. The data are as follows.

   Attendance              Probability   Observed frequency
   Week 2 and Week 4       θ_11          n_11 = 25
   Week 2 but not Week 4   θ_10          n_10 = 7
   Week 4 but not Week 2   θ_01          n_01 = 6
   Neither week            θ_00          n_00 = 13

Suppose that the prior distribution for (θ_11, θ_10, θ_01, θ_00) is a Dirichlet distribution with density proportional to θ_11³ θ_10 θ_01 θ_00².

(a) Find the prior means and prior variances of θ_11, θ_10, θ_01, θ_00.

(b) Find the posterior distribution.

(c) Find the posterior means and posterior variances of θ_11, θ_10, θ_01, θ_00.

(d) Using the R function hpdbeta which may be obtained from the Web page (or otherwise), find a 95% posterior hpd interval, based on the exact posterior distribution, for θ_00.

(e) Find an approximate 95% hpd interval for θ_00 using a normal approximation based on the posterior mode and the partial second derivatives of the log posterior density. Compare this with the exact hpd interval. Hint: To find the posterior mode you will need to introduce a Lagrange multiplier.

(f) The population mean number of attendances out of two is µ = 2θ_11 + θ_10 + θ_01. Find the posterior mean of µ and an approximation to the posterior standard deviation of µ.

2. Samples are taken from twenty wagonloads of an industrial mineral and analysed. The amounts in ppm (parts per million) of an impurity are found to be as follows. [Data omitted.] We regard these as independent samples from a normal distribution with mean µ and variance σ² = τ⁻¹. Find a 95% posterior hpd interval for µ under each of the following two conditions.

(a) The value of τ is known to be 0.1 and our prior distribution for µ is normal with mean 60.0 and standard deviation 20.0.

(b) The value of τ is unknown. Our prior distribution for τ is a gamma distribution with mean 0.1 and standard deviation 0.05. Our conditional prior distribution for µ given τ is normal with mean 60.0 and precision 0.05τ (that is, standard deviation (0.05τ)^{-1/2}).

3. We observe a sample of 30 observations from a normal distribution with mean µ and precision τ. The data, y_1, ..., y_30, are such that Σ y_i = 72 and Σ y_i² = 267.

(a) Suppose that the value of τ is known to be 0.04 and that our prior distribution for µ is normal with mean 0 and variance 100. Find the posterior distribution of µ and evaluate a posterior 95% hpd interval for µ.

(b) Suppose that we have a gamma(1, 10) prior distribution for τ and our conditional prior distribution for µ given τ is normal with mean 0 and variance (0.1τ)⁻¹. Find the marginal posterior distribution for τ, the marginal posterior distribution for µ and the marginal posterior 95% hpd interval for µ.

4. The following data come from the experiment reported by MacGregor et al. (1979). They give the supine systolic blood pressures (mm Hg) for fifteen patients with moderate essential hypertension. The measurements were taken immediately before and two hours after taking a drug.

   Patient   Before   After        Patient   Before   After
   1         210      201          9         173      147
   2         169      165          10        146      136
   3         187      166          11        174      151
   4         160      157          12        201      168
   5         167      147          13        198      179
   6         176      145          14        148      129
   7         185      168          15        154      131
   8         206      180

We are interested in the effect of the drug on blood pressure. We assume that, given parameters µ, τ, the changes in blood pressure, from before to after, in the n patients are independent and normally distributed with unknown mean µ and unknown precision τ. The fifteen differences are as follows.

   -9  -4  -21  -3  -20  -31  -17  -26  -26  -10  -23  -33  -19  -19  -23

Our prior distribution for τ is a gamma(0.35, 1.01) distribution. Our conditional prior distribution for µ given τ is a normal N(0, [0.003τ]⁻¹) distribution.

(a) Find the marginal posterior distribution of τ.

(b) Find the marginal posterior distribution of µ.

(c) Find the marginal posterior 95% hpd interval for µ.

(d) Comment on what you can conclude about the effect of the drug.

5. The lifetimes of certain components are supposed to follow a Weibull distribution with known shape parameter α = 2. The probability density function of the lifetime distribution is

   f(t) = αρ²t exp[-(ρt)²]

for 0 < t < ∞. We will observe a sample of n such lifetimes where n is large.

(a) Assuming that the prior density is nonzero and reasonably flat so that it may be disregarded, find an approximation to the posterior distribution of ρ. Find an approximate 95% hpd interval for ρ when n = 300, Σ log(t_i) = … and Σ t_i² = ….

(b) Assuming that the prior distribution is a gamma(a, b) distribution, find an approximate 95% hpd interval for ρ, taking into account this prior, when a = 2, b = 100, n = 300, Σ log(t_i) = … and Σ t_i² = ….

6. Given the value of λ, the number X_i of transactions made by customer i at an online store in a year has a Poisson(λ) distribution, with X_i independent of X_j for i ≠ j. The value of λ is unknown. Our prior distribution for λ is a gamma(5,1) distribution. We observe the numbers of transactions in a year for 45 customers and Σ x_i = 182.

(a) Using a χ² table (i.e. without a computer) find the lower 2.5% point and the upper 2.5% point of the prior distribution of λ. (These bound a 95% symmetric prior credible interval).

(b) Find the posterior distribution of λ.

(c) Using a normal approximation to the posterior distribution, based on the posterior mean and variance, find a 95% symmetric posterior credible interval for λ.

(d) Find an expression for the posterior predictive probability that a customer makes m transactions in a year.

(e) As well as these ordinary customers, we believe that there is a second group of individuals. The number of transactions in a year for a member of this second group has, given θ, a Poisson(θ) distribution and our beliefs about the value of θ are represented by a gamma(1,0.05) distribution. A new individual is observed who makes 10 transactions in a year. Given that our prior probability that this is an ordinary customer is 0.9, find our posterior probability that this is an ordinary customer.

Hint: You may find it best to calculate the logarithms of the predictive probabilities before exponentiating these. For this you might find the R function lgamma useful. It calculates the log of the gamma function. Alternatively it is possible to do the calculation using the R function dnbinom. (N.B. In reality a slightly more complicated model is used in this type of application).

7. The following data give the heights in cm of 25 ten-year-old children. [Data omitted.] We assume that, given the values of µ and τ, these are independent observations from a normal distribution with mean µ and variance τ⁻¹.

(a) Assuming that the value of τ⁻¹ is known to be 64 and our prior distribution for µ is normal with mean 55 and standard deviation 5, find a 95% hpd interval for the height in cm of another ten-year-old child drawn from the same population.

(b) Assume now that the value of τ is unknown but we have a prior distribution for it which is a gamma(2,18) distribution and our conditional prior distribution for µ given τ is normal with mean 55 and variance (2.56τ)⁻¹. Find a 95% hpd interval for the height in cm of another ten-year-old child drawn from the same population.
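The calculation needed for Question 7(a) can be organised in R using the standard conjugate-update formulae. This is a sketch only, not part of the original sheet: the function name is ours and y stands for the vector of 25 heights.

> pred.interval <- function(y) {
+   P0 <- 1/5^2                          # prior precision for mu
+   tau <- 1/64                          # known data precision
+   P1 <- P0 + length(y)*tau             # posterior precision for mu
+   M1 <- (P0*55 + tau*sum(y))/P1        # posterior mean for mu
+   M1 + c(-1,1)*1.96*sqrt(1/P1 + 64)    # 95% interval for a new child
+ }

The predictive variance 1/P1 + 64 adds the posterior uncertainty about µ to the sampling variance of a new observation.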

8. A random sample of n = 1000 people was chosen from a large population. Each person was asked whether they approved of a proposed new law. The number answering "Yes" was x = 37. (For the purpose of this exercise all other responses and non-responses are treated as simply "Not Yes"). Assume that x is an observation from the binomial(n, p) distribution where p is the unknown proportion of people in the population who would answer "Yes". Our prior distribution for p is a uniform distribution on (0, 1). Let p = Φ(θ) so θ = Φ⁻¹(p) where Φ(y) is the standard normal distribution function and Φ⁻¹(z) is its inverse.

(a) Find the maximum likelihood estimate of p and hence find the maximum likelihood estimate of θ.

(b) Disregarding the prior distribution, find a large-sample approximation to the posterior distribution of θ.

(c) Using your approximate posterior distribution for θ, find an approximate 95% hpd interval for θ.

(d) Use the exact posterior distribution for p to find the actual posterior probability that θ is inside your approximate hpd interval.

Notes: The standard normal distribution function is

   Φ(x) = ∫_{-∞}^{x} φ(u) du   where   φ(u) = (2π)^{-1/2} exp{-u²/2}.

Let l be the log-likelihood. Then

   dl/dθ = (dl/dp)(dp/dθ)

and

   d²l/dθ² = (d/dθ){(dl/dp)(dp/dθ)}
           = ((d/dθ){dl/dp})(dp/dθ) + (dl/dp)(d²p/dθ²)
           = (d²l/dp²)(dp/dθ)² + (dl/dp)(d²p/dθ²).

Derivatives of p:

   dp/dθ = φ(θ)   and   d²p/dθ² = -θφ(θ).

You can evaluate Φ⁻¹(u) using R with qnorm(u,0,1) and φ(u) is given by dnorm(u,0,1).

9. The amounts of rice, by weight, in 20 nominally 500g packets are determined. The weights, in g, are as follows. [Data omitted.] Assume that, given the values of parameters µ, τ, the weights are independent and each has a normal N(µ, τ⁻¹) distribution. The values of µ and τ are unknown. Our prior distribution is as follows. We have a gamma(2, 9) prior distribution for τ and a N(500, (0.005τ)⁻¹) conditional prior distribution for µ given τ.

(a) Find the posterior probability that µ < 495.

(b) Find the posterior predictive probability that a new packet of rice will contain less than 500g of rice.

10. A machine which is used in a manufacturing process jams from time to time. It is thought that the frequency of jams might change over time as the machine becomes older. Once every three months the number of jams in a day is counted. The results are as follows.

   Observation i                 1    2    3    4    5    6    7    8
   Age of machine t_i (months)   3    6    9    12   15   18   21   24
   Number of jams y_i            10   13   24   17   20   22   20   23

Our model is as follows. Given the values of two parameters α, β, the number of jams y_i on a day when the machine has age t_i months has a Poisson distribution

   y_i ~ Poisson(λ_i)   where   log_e(λ_i) = α + βt_i.

Assume that the effect of our prior distribution on the posterior distribution is negligible and that large-sample approximations may be used.

(a) Let the values of α and β which maximise the likelihood be α̂ and β̂. Assuming that the likelihood is differentiable at its maximum, show that these satisfy the following two equations

   Σ_{i=1}^8 (λ̂_i - y_i) = 0   and   Σ_{i=1}^8 t_i(λ̂_i - y_i) = 0

where log_e(λ̂_i) = α̂ + β̂t_i, and show that these equations are satisfied (to a good approximation) by α̂ = 2.55 and β̂ = 0.0264. (You may use R to help with the calculations, but show your commands). You may assume from now on that these values maximise the likelihood.

(b) Find an approximate symmetric 95% posterior interval for α + 24β.

(c) Find an approximate symmetric 95% posterior interval for exp(α + 24β), the mean jam-rate per day at age 24 months. (You may use R to help with the calculations, but show your commands).

Homework 5

Solutions to Questions 9 and 10 of Problems 5 are to be submitted in the Homework Letterbox no later than 4.00pm on Tuesday May 5th.

Reference

MacGregor, G.A., Markandu, N.D., Roulston, J.E. and Jones, J.C., 1979. Essential hypertension: effect of an oral inhibitor of angiotensin-converting enzyme. British Medical Journal, 2, 1106-1109.

Solutions

1. (a) The prior distribution is Dirichlet(4,2,2,3). So A_0 = 4+2+2+3 = 11. The prior means are a_{0,i}/A_0. The prior variances are

   a_{0,i}(A_0 - a_{0,i}) / (A_0²(A_0 + 1)).

Prior means:

   θ_11: 4/11 = 0.3636
   θ_10: 2/11 = 0.1818
   θ_01: 2/11 = 0.1818
   θ_00: 3/11 = 0.2727

Prior variances:

   θ_11: (4 × 7)/(11² × 12) = 0.0193
   θ_10: (2 × 9)/(11² × 12) = 0.0124
   θ_01: (2 × 9)/(11² × 12) = 0.0124
   θ_00: (3 × 8)/(11² × 12) = 0.0165

NOTE: Suppose that we are given prior means m_1, ..., m_4 and one prior standard deviation s_1. Then a_{0,i} = m_i A_0 and

   s_1² = m_1 A_0 (A_0 - m_1 A_0) / (A_0²(A_0 + 1)) = m_1(1 - m_1)/(A_0 + 1).

Hence

   A_0 + 1 = m_1(1 - m_1)/s_1² = (0.3636 × 0.6364)/0.1389² = 12.

Hence A_0 = 11 and a_{0,i} = 11m_i. For example a_{0,1} = 11 × 0.3636 = 4.

(b) The posterior distribution is Dirichlet(4+25, 2+7, 2+6, 3+13). That is Dirichlet(29, 9, 8, 16).

(c) Now A_1 = 29+9+8+16 = 62. The posterior means are a_{1,i}/A_1. The posterior variances are

   a_{1,i}(A_1 - a_{1,i}) / (A_1²(A_1 + 1)).

Posterior means:

   θ_11: 29/62 = 0.4677
   θ_10: 9/62 = 0.1452
   θ_01: 8/62 = 0.1290
   θ_00: 16/62 = 0.2581

Posterior variances:

   θ_11: (29 × 33)/(62² × 63) = 0.00395
   θ_10: (9 × 53)/(62² × 63) = 0.00197
   θ_01: (8 × 54)/(62² × 63) = 0.00178
   θ_00: (16 × 46)/(62² × 63) = 0.00304

(d) The posterior distribution for θ_00 is beta(16, 62 - 16). That is beta(16,46). Using the R command hpdbeta(0.95,16,46) gives … < θ_00 < ….

(e) The log posterior density is (apart from a constant)

   Σ_{j=1}^4 (a_{1,j} - 1) log θ_j.

Add λ(Σ_{j=1}^4 θ_j - 1) to this and differentiate wrt θ_j, then set the derivative equal to zero. This gives

   (a_{1,j} - 1)/θ̂_j + λ = 0

which leads to

   θ̂_j = -(a_{1,j} - 1)/λ.

However Σ_{j=1}^4 θ_j = 1 so

   λ = -Σ_{j=1}^4 (a_{1,j} - 1)

and

   θ̂_j = (a_{1,j} - 1) / (Σ_k a_{1,k} - 4).

Hence the posterior mode for θ_00 is

   θ̂_00 = (16 - 1)/(62 - 4) = 15/58 = 0.2586.

The second derivatives of the log posterior density are

   ∂²l/∂θ_j² = -(a_{1,j} - 1)/θ_j²   and   ∂²l/(∂θ_j ∂θ_k) = 0.

Since the mixed partial second derivatives are zero, the information matrix is diagonal and the posterior variance of θ_j is approximately

   θ̂_j²/(a_{1,j} - 1) = (a_{1,j} - 1)²/[(a_{1,j} - 1)(Σ_k a_{1,k} - 4)²] = (a_{1,j} - 1)/(Σ_k a_{1,k} - 4)².

The posterior variance of θ_00 is approximately 15/58² = 0.00446. The approximate 95% hpd interval is 0.2586 ± 1.96√0.00446, that is

   0.128 < θ_00 < 0.389.

This is a little wider than the exact interval.

(f) Approximation based on posterior mode and curvature:

Posterior modes:

   θ_11: 28/58 = 0.4828
   θ_10: 8/58 = 0.1379
   θ_01: 7/58 = 0.1207

So, the approx. posterior mean of µ is 2 × 0.4828 + 0.1379 + 0.1207 = 1.224.

Approx. posterior variances:

   θ_11: 28/58² = 0.00832
   θ_10: 8/58² = 0.00238
   θ_01: 7/58² = 0.00208

Since the (approx.) covariances are all zero, the approx. posterior variance of µ is

   4 × 0.00832 + 0.00238 + 0.00208 = 0.0377

so the approx. standard deviation is √0.0377 = 0.194.

N.B. There is an alternative exact calculation, as follows, which is also acceptable.

Posterior mean:

   E(µ | data) = 2 × 29/62 + 9/62 + 8/62 = 75/62 = 1.210.

Posterior covariances:

   cov(θ_j, θ_k) = -a_{1,j} a_{1,k} / (A_1²(A_1 + 1))   (j ≠ k)

   var(µ) = 4var(θ_11) + var(θ_10) + var(θ_01) + 4cov(θ_11, θ_10) + 4cov(θ_11, θ_01) + 2cov(θ_10, θ_01)
          = 4(0.00395) + 0.00197 + 0.00178 + 4(-0.00108) + 4(-0.00096) + 2(-0.00030)
          = 0.0108.

So the standard deviation is 0.104. The difference is quite big!
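The function hpdbeta used in part (d) comes from the module web page, so it is not generally available. An equivalent interval can be found with standard R functions by searching for the lower tail probability at which the interval of content 0.95 has equal density at its two end points. A sketch, assuming a unimodal beta density (the function name is ours):

> hpd.beta <- function(p, a, b) {
+   f <- function(lo) {
+     l <- qbeta(lo, a, b)
+     u <- qbeta(lo + p, a, b)
+     dbeta(l, a, b) - dbeta(u, a, b)   # zero when end densities are equal
+   }
+   lo <- uniroot(f, c(1e-8, 1 - p - 1e-8))$root
+   c(qbeta(lo, a, b), qbeta(lo + p, a, b))
+ }
> hpd.beta(0.95, 16, 46)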

2. From the data

   ȳ = (1/n) Σ y_i = …   and   s_n² = (1/n) Σ (y_i - ȳ)² = (1/n) Σ y_i² - ȳ² = ….

(a) Prior mean: M_0 = 60.0
    Prior precision: P_0 = 1/20.0² = 0.0025
    Data precision: P_d = nτ = 20 × 0.1 = 2
    Posterior precision: P_1 = P_0 + P_d = 2.0025
    Posterior mean: M_1 = (P_0 M_0 + P_d ȳ)/P_1 = (0.0025 × 60.0 + 2ȳ)/2.0025 = …
    Posterior std. dev.: 2.0025^{-1/2} = 0.7067

95% hpd interval: M_1 ± 1.96 × 0.7067 = M_1 ± 1.385. That is … < µ < ….

(b) Prior: τ ~ gamma(d/2, dv/2) where

   (d/2)/(dv/2) = 1/v = 0.1   so   v = 10

and

   √(d/2)/(dv/2) = (1/v)√(2/d) = 0.05   so   √(2/d) = 0.5   so   2/d = 0.25   so   d = 8.

Hence

   d_0 = 8,  v_0 = 10,  c_0 = 0.05,  m_0 = 60.0,

   m_1 = (c_0 m_0 + nȳ)/(c_0 + n) = (0.05 × 60.0 + 20ȳ)/20.05 = …
   c_1 = c_0 + n = 20.05
   d_1 = d_0 + n = 28
   r = (1/n) Σ (y_i - m_0)² = (ȳ - m_0)² + s_n² = …
   v_d = (c_0 r + n s_n²)/(c_0 + n) = …
   v_1 = (d_0 v_0 + n v_d)/(d_0 + n) = …

95% hpd interval:

   m_1 ± t_28 √(v_1/c_1)

That is m_1 ± 2.048 √(v_1/c_1), giving … < µ < ….

3. (a) We have

   P_0 = 0.01
   P_d = nτ = 30 × 0.04 = 1.2
   P_1 = 0.01 + 1.2 = 1.21
   M_0 = 0,  ȳ = 2.4
   M_1 = (P_0 M_0 + P_d ȳ)/P_1 = (0 + 1.2 × 2.4)/1.21 = 2.380

Posterior: µ ~ N(2.380, 1.21⁻¹). That is µ ~ N(2.380, 0.8264). 95% hpd interval: 2.380 ± 1.96√0.8264. That is

   0.60 < µ < 4.16

(b) We have

   d_0 = 2,  d_1 = d_0 + n = 32
   v_0 = 10
   c_0 = 0.1,  c_1 = c_0 + n = 30.1
   m_0 = 0,  ȳ = 2.4
   m_1 = (c_0 m_0 + nȳ)/(c_0 + n) = (0 + 72)/30.1 = 2.392

   s_n² = (1/n) Σ y_i² - ((1/n) Σ y_i)² = 267/30 - 2.4² = 8.9 - 5.76 = 3.14

   (ȳ - m_0)² = (2.4 - 0)² = 5.76
   r = (ȳ - m_0)² + s_n² = 8.9
   v_d = (c_0 r + n s_n²)/(c_0 + n) = (0.1 × 8.9 + 30 × 3.14)/30.1 = 3.159
   v_1 = (d_0 v_0 + n v_d)/(d_0 + n) = (2 × 10 + 30 × 3.159)/32 = 3.587

Marginal posterior distribution for τ: d_1 v_1 τ = 114.8τ ~ χ²_32.

Marginal posterior distribution for µ:

   (µ - m_1)/√(v_1/c_1) = (µ - 2.392)/0.3452 ~ t_32

95% hpd interval:

   m_1 ± 2.037 √(v_1/c_1).

That is 2.392 ± 2.037 × 0.3452 = 2.392 ± 0.703. That is

   1.69 < µ < 3.10

4. Data: n = 15, Σy = -284, Σy² = 6518, ȳ = -284/15 = -18.93,

   s_n² = (1/15){6518 - 15 × (-18.93)²} = 1140.9/15 = 76.06

Calculate posterior:

   d_0 = 0.7
   v_0 = 2.02/0.7 = 2.8857
   c_0 = 0.003
   m_0 = 0
   d_1 = d_0 + n = 15.7
   (ȳ - m_0)² = ȳ² = 358.47
   r = (ȳ - m_0)² + s_n² = 434.53
   v_d = (c_0 r + n s_n²)/(c_0 + n) = (0.003 × 434.53 + 15 × 76.06)/15.003 = 76.13
   v_1 = (d_0 v_0 + n v_d)/(d_0 + n) = (0.7 × 2.8857 + 15 × 76.13)/15.7 = 72.87
   c_1 = c_0 + n = 15.003
   m_1 = (c_0 m_0 + nȳ)/(c_0 + n) = -284/15.003 = -18.93

(a) The marginal posterior distribution for τ is gamma(d_1/2, d_1 v_1/2). That is gamma(7.85, 572.0). (Alternatively d_1 v_1 τ ~ χ²_{d_1}. That is 1144.1τ ~ χ²_{15.7}).

(b) Marginal posterior for µ:

   (µ - m_1)/√(v_1/c_1) = (µ + 18.93)/2.204 ~ t_{15.7}

(c) We can use R for the critical points of t_{15.7}: qt(0.975,15.7) gives ±2.123.

95% interval: -18.93 ± 2.123 × 2.204. That is

   -23.61 < µ < -14.25

(d) Comment, e.g., since zero is well outside the 95% interval it seems clear that the drug reduces the blood pressure.
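The computations in Solutions 3(b) and 4 follow the same template, so they are easy to reproduce from the sufficient statistics. A minimal R sketch for Question 3(b), not part of the original solution:

> n <- 30; sy <- 72; syy <- 267
> ybar <- sy/n; snsq <- syy/n - ybar^2
> d0 <- 2; v0 <- 10; c0 <- 0.1; m0 <- 0
> c1 <- c0 + n; d1 <- d0 + n
> m1 <- (c0*m0 + sy)/c1
> r <- (ybar - m0)^2 + snsq
> vd <- (c0*r + n*snsq)/c1
> v1 <- (d0*v0 + n*vd)/d1
> m1 + c(-1,1)*qt(0.975, d1)*sqrt(v1/c1)   # 95% hpd interval for mu

R also handles the fractional degrees of freedom in Solution 4 directly, e.g. 1 - pt((0 + 18.93)/2.204, 15.7) gives the (tiny) posterior probability that µ ≥ 0, which is the basis of the comment in (d).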

5. (a) The likelihood is

   L = Π_{i=1}^n 2ρ²t_i exp[-(ρt_i)²] = 2ⁿ ρ^{2n} (Π t_i) exp[-ρ² Σ t_i²]

The log likelihood is

   l = n log 2 + 2n log ρ + Σ log(t_i) - ρ² Σ t_i².

So

   ∂l/∂ρ = 2n/ρ - 2ρ Σ t_i²

and, setting this equal to zero at the mode ρ̂, we find

   2n/ρ̂ = 2ρ̂ Σ t_i²   so   ρ̂² = n/Σ t_i²   so   ρ̂ = √(n/Σ t_i²) = ….

The second derivative is

   ∂²l/∂ρ² = -2n/ρ² - 2Σ t_i²

so the posterior variance is approximately

   1/(2n/ρ̂² + 2Σ t_i²) = 1/(2Σ t_i² + 2Σ t_i²) = 1/(4Σ t_i²) = ….

Our 95% hpd interval is therefore ρ̂ ± 1.96/√(4Σ t_i²). That is … < ρ < ….

(b) The prior density is proportional to ρ^{2-1}e^{-100ρ} so the log prior density is log ρ - 100ρ plus a constant. The log posterior is therefore g(ρ) plus a constant where

   g(ρ) = (2n + 1) log ρ - 100ρ - ρ² Σ t_i².

So

   ∂g/∂ρ = (2n + 1)/ρ - 100 - 2ρ Σ t_i².

Setting this equal to zero at the mode ρ̂ we find

   2Σ t_i² ρ̂² + 100ρ̂ - (2n + 1) = 0.

This quadratic equation has two solutions but one is negative and ρ must be positive so

   ρ̂ = [-100 + √(100² + 8Σ t_i²(2n + 1))] / (4Σ t_i²) = ….

The second derivative is

   ∂²g/∂ρ² = -(2n + 1)/ρ² - 2Σ t_i²

so the posterior variance is approximately

   1/[(2n + 1)/ρ̂² + 2Σ t_i²] = ….

Our 95% hpd interval is therefore ρ̂ ± 1.96 × …. That is … < ρ < …. So the prior makes no noticeable difference in this case.

6. (a) λ ~ gamma(5, 1) so 2λ ~ gamma(5, 1/2), i.e. gamma(10/2, 1/2), i.e. χ²_10. From tables, the 95% interval is 3.247 < 2λ < 20.48. That is

   1.62 < λ < 10.24

(b) The prior density is prop. to λ^{5-1}e^{-λ}. The likelihood is

   L = Π_{i=1}^{45} e^{-λ}λ^{x_i}/x_i! = e^{-45λ} λ^{Σx_i} / Π x_i! ∝ e^{-45λ} λ^{182}.

The posterior density is prop. to λ^{187-1}e^{-46λ}. This is a gamma(187, 46) distribution.

(c) Posterior mean: 187/46 = 4.065
    Posterior variance: 187/46² = 0.0884
    Posterior sd: √0.0884 = 0.297

95% interval: 4.065 ± 1.96 × 0.297. That is

   3.48 < λ < 4.65

(d) Joint prob. of λ, X = m:

   [46^{187}/Γ(187)] λ^{187-1} e^{-46λ} × e^{-λ}λ^m/m!
      = [46^{187} Γ(187 + m) / (Γ(187) m! 47^{187+m})] × [47^{187+m}/Γ(187 + m)] λ^{187+m-1} e^{-47λ}

Integrate out λ:

   Pr(X = m) = [Γ(187 + m)/(Γ(187) m!)] × 46^{187}/47^{187+m}
             = [(186 + m)!/(186! m!)] (46/47)^{187} (1/47)^m

(e) Joint probability (density) of θ, X = m:

   0.05e^{-0.05θ} × e^{-θ}θ^m/m! = [0.05 Γ(1 + m)/(m! 1.05^{m+1})] × [1.05^{m+1}/Γ(1 + m)] θ^{1+m-1} e^{-1.05θ}

Integrate out θ:

   Pr(X = m) = 0.05 Γ(1 + m)/(1.05^{m+1} m!) = (0.05/1.05)(1/1.05)^m

Log predictive probs:

Ordinary:

   log(P_1) = log[Γ(187 + 10)] - log[Γ(187)] - log[Γ(11)] + 187 log(46/47) + 10 log(1/47)
            = log[Γ(197)] - log[Γ(187)] - log[Γ(11)] + 187 log(46) - 197 log(47)
            = -5.0798

> lgamma(197) - lgamma(187) - lgamma(11) + 187*log(46) - 197*log(47)
[1] -5.0798

Type 2:

   log(P_2) = log(0.05/1.05) + 10 log(1/1.05) = log(0.05) - 11 log(1.05) = -3.5324

Hence the predictive probabilities are as follows.

   Ordinary: P_1 = exp(-5.0798) = 0.00622
   Type 2:   P_2 = exp(-3.5324) = 0.02923

Hence the posterior probability that this is an ordinary customer is

   (0.9 × 0.00622)/(0.9 × 0.00622 + 0.1 × 0.02923) = 0.657.
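As the hint to Question 6 suggests, parts (a) and (e) can also be done with built-in R functions: quantile functions replace the χ² table, and dnbinom gives the predictive probabilities directly, because a gamma(a, b) mixture of Poissons is negative binomial with size a and prob b/(b+1). A sketch:

> qchisq(c(0.025, 0.975), 10)/2      # part (a): 2*lambda ~ chi-squared on 10 df
> qgamma(c(0.025, 0.975), 5, 1)      # equivalently, directly from the gamma(5,1) prior
> p1 <- dnbinom(10, size=187, prob=46/47)     # ordinary customer
> p2 <- dnbinom(10, size=1, prob=0.05/1.05)   # second group
> 0.9*p1/(0.9*p1 + 0.1*p2)                    # posterior Pr(ordinary), approx 0.657

The first two commands both give the interval 1.62 < λ < 10.24 found above.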

9. Prior:

   τ ~ gamma(4/2, 18/2)

so d_0 = 4, d_0 v_0 = 18, v_0 = 4.5.

   µ | τ ~ N(500, (0.005τ)⁻¹)

so m_0 = 500, c_0 = 0.005.

Data: Σy = 9857, n = 20, ȳ = 9857/20 = 492.85, Σy² = 4858467.0,

   s_n² = (1/n){Σy² - nȳ²} = 444.55/20 = 22.2275

Posterior:

   c_1 = c_0 + n = 20.005
   m_1 = (c_0 m_0 + nȳ)/(c_0 + n) = (0.005 × 500 + 9857)/20.005 = 492.852
   d_1 = d_0 + n = 24
   r = (ȳ - m_0)² + s_n² = 51.1225 + 22.2275 = 73.35
   v_d = (c_0 r + n s_n²)/(c_0 + n) = (0.005 × 73.35 + 444.55)/20.005 = 22.2403
   v_1 = (d_0 v_0 + n v_d)/(d_0 + n) = (18 + 444.806)/24 = 19.2836

(a) The marginal posterior distribution of µ is given by

   (µ - 492.852)/√(19.2836/20.005) ~ t_24   (2 marks)

   Pr(µ < 495) = Pr(t_24 < (495 - 492.852)/0.9818) = Pr(t_24 < 2.1880) = 0.980

(E.g. use R: pt(2.1880,24).) (3 marks)

(b)

   c_p = c_1/(c_1 + 1) = 20.005/21.005 = 0.9524

   (Y - 492.852)/√(19.2836/0.9524) ~ t_24

   Pr(Y < 500) = Pr(t_24 < (500 - 492.852)/4.4997) = Pr(t_24 < 1.5886) = 0.937

(E.g. use R: pt(1.5886,24).) (3 marks)
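Both probabilities can be reproduced in R from the summary statistics. A sketch, not part of the original solution; the value of Σy² here is reconstructed from s_n² above:

> sy <- 9857; syy <- 4858467; n <- 20
> ybar <- sy/n; snsq <- (syy - n*ybar^2)/n
> c0 <- 0.005; m0 <- 500; d0 <- 4; v0 <- 4.5
> c1 <- c0 + n; d1 <- d0 + n
> m1 <- (c0*m0 + sy)/c1
> vd <- (c0*((ybar - m0)^2 + snsq) + n*snsq)/c1
> v1 <- (d0*v0 + n*vd)/d1
> pt((495 - m1)/sqrt(v1/c1), d1)              # part (a)
> pt((500 - m1)/sqrt(v1*(c1 + 1)/c1), d1)     # part (b): predictive scale uses v1/cp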

10. (a) Likelihood:

   L = Π_{i=1}^8 e^{-λ_i} λ_i^{y_i} / y_i!

Log likelihood:

   l = Σ [-λ_i + y_i log λ_i - log(y_i!)] = Σ [-λ_i + y_i(α + βt_i) - log(y_i!)]

Derivatives:

   ∂λ_i/∂α = ∂e^{α+βt_i}/∂α = λ_i
   ∂λ_i/∂β = ∂e^{α+βt_i}/∂β = λ_i t_i

   ∂l/∂α = Σ (-λ_i + y_i) = -Σ (λ_i - y_i)
   ∂l/∂β = Σ (-λ_i t_i + y_i t_i) = -Σ t_i(λ_i - y_i)

At the maximum

   ∂l/∂α = ∂l/∂β = 0.

Hence α̂ and β̂ satisfy the given equations.

Calculations in R:

> y<-c(10,13,24,17,20,22,20,23)
> t<-seq(3,24,3)
> lambda<-exp(2.55+0.0264*t)
> sum(lambda-y)
[1] -0.2532
> sum(t*(lambda-y))
[1] -3.6236

These values seem close to zero but let us try a small change to the parameter values:

> lambda<-exp(2.56+0.0264*t)
> sum(lambda-y)
[1] 1.2417
> sum(t*(lambda-y))
[1] 18.411

The results are now much further from zero, suggesting that the given values are very close to the solutions. (5 marks)
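The estimates can also be checked against R's built-in Poisson regression, which maximises the same likelihood. The question does not require glm; with y and t as defined above, this is only a convenient cross-check:

> fit <- glm(y ~ t, family=poisson)   # log link is the default
> coef(fit)                           # should be close to 2.55 and 0.0264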

(b) Second derivatives:

   ∂²l/∂α² = -Σ ∂λ_i/∂α = -Σ λ_i
   ∂²l/∂β² = -Σ t_i ∂λ_i/∂β = -Σ t_i² λ_i
   ∂²l/∂α∂β = -Σ ∂λ_i/∂β = -Σ t_i λ_i

Variance matrix:

   V = -( ∂²l/∂α²     ∂²l/∂α∂β
          ∂²l/∂α∂β    ∂²l/∂β²  )⁻¹

Numerically using R:

> lambda<-exp(2.55+0.0264*t)
> d<-matrix(nrow=2,ncol=2)
> d[1,1]<- -sum(lambda)
> d[1,2]<- -sum(t*lambda)
> d[2,1]<- -sum(t*lambda)
> d[2,2]<- -sum((t^2)*lambda)
> V<- -solve(d)
> V
           [,1]       [,2]
[1,]  0.0382650 -0.0021400
[2,] -0.0021400  0.0001452

The mean of α + 24β is α̂ + 24β̂ = 3.1853. The variance is

   (1  24) V (1  24)ᵀ = 0.038265 - 48 × 0.0021400 + 576 × 0.0001452 = 0.01918.

An alternative matrix-based calculation in R:

> m<-c(1,24)
> dim(m)<-c(1,2)
> v<-m%*%V%*%t(m)
> v
        [,1]
[1,] 0.01918

The approximate 95% interval is 3.1853 ± 1.96√0.01918 = 3.1853 ± 0.2714. That is

   2.9139 < α + 24β < 3.4567

(5 marks)

(c) The interval for λ_24 is e^{2.9139} < e^{α+24β} < e^{3.4567}. That is

   18.43 < λ_24 < 31.71

(2 marks)
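If the glm cross-check from part (a) is used, its estimated variance matrix plays the role of V, and the same intervals follow. A sketch:

> m <- c(1, 24)
> est <- sum(m*coef(fit))
> se <- sqrt(drop(t(m) %*% vcov(fit) %*% m))
> est + c(-1,1)*1.96*se               # interval for alpha + 24*beta
> exp(est + c(-1,1)*1.96*se)          # interval for lambda at age 24 months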
