1 A tutorial on Bayesian inference for the normal linear model Richard Boys Statistics Research Group, Newcastle University, UK
2 Motivation
Much of Bayesian inference nowadays analyses complex hierarchical models using computer-intensive methods: MCMC, pMCMC, ABC, HMC, ...
Not that long ago, most analyses used either
- a conjugate analysis of, say, the normal linear model, or
- a non-conjugate analysis fitted via techniques such as Gaussian quadrature or Laplace's method to evaluate the required integrals
This tutorial gives an overview of the basics underpinning an analysis of data assuming a normal linear model and a conjugate prior
3 Normal random sample
Example: The 18th century physicist Henry Cavendish made 23 experimental determinations of the earth's density, and these data (in g/cm³) have sufficient statistics $n = 23$, $\bar{x} = $ , $s = $ .
[Figure: normal Q-Q plot of the data (sample quantiles against theoretical quantiles)]
4 Conjugate analysis
Data: $X_i \mid \mu, \tau \sim N(\mu, 1/\tau)$, $i = 1, 2, \dots, n$ (indep)
Likelihood function: $\pi(x \mid \mu, \tau) = \left(\frac{\tau}{2\pi}\right)^{n/2} \exp\left[-\frac{n\tau}{2}\left\{s^2 + (\bar{x} - \mu)^2\right\}\right]$
Conjugate prior: take $\pi(\mu, \tau) \propto \tau^{1/2} \exp\left[-\frac{c\tau}{2}(\mu - b)^2\right] \times \tau^{g-1} e^{-h\tau} = \pi(\mu \mid \tau)\,\pi(\tau)$, i.e. $\mu \mid \tau \sim N\left(b, \frac{1}{c\tau}\right)$, $\tau \sim Ga(g, h)$
Write $(\mu, \tau)^T \sim NGa(b, c, g, h)$. The prior density is
$\pi(\mu, \tau) \propto \tau^{g - \frac{1}{2}} \exp\left\{-\frac{\tau}{2}\left[c(\mu - b)^2 + 2h\right]\right\}$, for $\mu \in \mathbb{R}$, $\tau > 0$
5 Question
What's the posterior distribution for $(\mu, \tau)^T$?
Hint: $c(\mu - b)^2 + n(\bar{x} - \mu)^2 = (c + n)\left\{\mu - \frac{cb + n\bar{x}}{c + n}\right\}^2 + \frac{nc(\bar{x} - b)^2}{c + n}$
6 Hint: $c(\mu - b)^2 + n(\bar{x} - \mu)^2 = (c + n)\left\{\mu - \frac{cb + n\bar{x}}{c + n}\right\}^2 + \frac{nc(\bar{x} - b)^2}{c + n}$
Using Bayes' Theorem, the posterior density is, for $\mu \in \mathbb{R}$, $\tau > 0$
$\pi(\mu, \tau \mid x) \propto \pi(\mu, \tau)\,\pi(x \mid \mu, \tau)$
$\propto \tau^{g + \frac{n}{2} - \frac{1}{2}} \exp\left\{-\frac{\tau}{2}\left[c(\mu - b)^2 + n(\bar{x} - \mu)^2 + 2h + ns^2\right]\right\}$
$\propto \tau^{g + \frac{n}{2} - \frac{1}{2}} \exp\left\{-\frac{\tau}{2}\left[(c + n)\left\{\mu - \frac{cb + n\bar{x}}{c + n}\right\}^2 + \frac{nc(\bar{x} - b)^2}{c + n} + 2h + ns^2\right]\right\}$
$\propto \tau^{G - \frac{1}{2}} \exp\left\{-\frac{\tau}{2}\left[C(\mu - B)^2 + 2H\right]\right\}$
where $B = \frac{cb + n\bar{x}}{c + n}$, $C = c + n$, $G = g + \frac{n}{2}$, $H = h + \frac{cn(\bar{x} - b)^2}{2(c + n)} + \frac{ns^2}{2}$
Therefore $(\mu, \tau)^T \mid x \sim NGa(B, C, G, H)$
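The update from $(b, c, g, h)$ to $(B, C, G, H)$ above is a four-line computation. A minimal sketch in Python (the function and variable names are mine, following the slide's notation):

```python
def nga_posterior(b, c, g, h, n, xbar, s2):
    """Conjugate update for a normal random sample:
    NGa(b, c, g, h) prior -> NGa(B, C, G, H) posterior, where n is the
    sample size, xbar the sample mean and s2 the (biased) sample variance."""
    C = c + n
    B = (c * b + n * xbar) / C
    G = g + n / 2
    H = h + c * n * (xbar - b) ** 2 / (2 * C) + n * s2 / 2
    return B, C, G, H
```

Note how the data mean is shrunk towards the prior mean $b$ with weight $c/(c+n)$, and how $H$ collects the prior rate, the sampling variability and the prior-data disagreement.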
7 Posterior analysis
Posterior: $(\mu, \tau)^T \mid x \sim NGa(B, C, G, H)$. Clearly $\tau \mid x \sim Ga(G, H)$.
Question: What's the marginal posterior for $\mu$?
Hints:
1. Posterior density: $\pi(\mu, \tau \mid x) \propto \tau^{G - \frac{1}{2}} \exp\left\{-\frac{\tau}{2}\left[C(\mu - B)^2 + 2H\right]\right\}$
2. $\int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \Gamma(a)/b^a$
3. If $Y \sim t_a(b, c)$ then it has density $f(y \mid a, b, c) \propto \left\{1 + \frac{(y - b)^2}{ac}\right\}^{-\frac{a+1}{2}}$, $y \in \mathbb{R}$
8 Hints: $\int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \Gamma(a)/b^a$ and $f(y \mid a, b, c) \propto \left\{1 + \frac{(y - b)^2}{ac}\right\}^{-\frac{a+1}{2}}$
The (marginal) posterior density for $\mu$ is, for $\mu \in \mathbb{R}$
$\pi(\mu \mid x) = \int_0^\infty \pi(\mu, \tau \mid x)\, d\tau \propto \int_0^\infty \tau^{G - \frac{1}{2}} \exp\left\{-\frac{\tau}{2}\left[C(\mu - B)^2 + 2H\right]\right\} d\tau$
$= \frac{\Gamma\left(G + \frac{1}{2}\right)}{\left[\{C(\mu - B)^2 + 2H\}/2\right]^{G + \frac{1}{2}}}$ using $\int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \Gamma(a)/b^a$
$\propto \left\{1 + \frac{C(\mu - B)^2}{2H}\right\}^{-\frac{2G+1}{2}}$
Therefore $\mu \mid x \sim t_{2G}\left(B, \frac{H}{GC}\right)$
9 Some distribution theory
Generalised t distribution: $Y \sim t_a(b, c)$
Density: $f(y \mid a, b, c) = \frac{\Gamma\left(\frac{a+1}{2}\right)}{\sqrt{ac\pi}\,\Gamma\left(\frac{a}{2}\right)} \left\{1 + \frac{(y - b)^2}{ac}\right\}^{-\frac{a+1}{2}}$, $y \in \mathbb{R}$
Parameters: $a > 0$, $b \in \mathbb{R}$, $c > 0$
Generalisation of the standard t-distribution since $(Y - b)/\sqrt{c} \sim t_a$
$E(Y) = \text{Mode}(Y) = b$ and $Var(Y) = \frac{ac}{a - 2}$, if $a > 2$
$t_a(0, 1) \equiv t_a$ and $\lim_{a \to \infty} t_a(b, c) = N(b, c)$
10 Inverse Chi distribution: $Y \sim \text{Inv-Chi}(a, b)$
Density: $f(y \mid a, b) = \frac{2 b^a}{\Gamma(a)}\, y^{-2a - 1} e^{-b/y^2}$, $y > 0$
Parameters: $a > 0$, $b > 0$
$E(Y) = \frac{\sqrt{b}\,\Gamma(a - 1/2)}{\Gamma(a)}$ and $Var(Y) = \frac{b}{a - 1} - E(Y)^2$, if $a > 1$
The name of the distribution comes from the fact that $1/Y^2 \sim Ga(a, b) \equiv \chi^2_{2a}/(2b)$
11 Summary of the posterior distribution
Posterior: $(\mu, \tau)^T \mid x \sim NGa(B, C, G, H)$
Marginal distributions: $\mu \mid x \sim t_{2G}\left(B, \frac{H}{GC}\right)$, $\tau \mid x \sim Ga(G, H)$, $\sigma \mid x \sim \text{Inv-Chi}(G, H)$
12 An Example
The 18th century physicist Henry Cavendish made 23 experimental determinations of the earth's density, and these data (in g/cm³) have sufficient statistics $n = 23$, $\bar{x} = $ , $s = $ .
Data model: $X_i \mid \mu, \tau \stackrel{\text{indep}}{\sim} N(\mu, 1/\tau)$, $i = 1, 2, \dots, 23$
Prior: $(\mu, \tau)^T \sim NGa(b = 5.41, c = 0.25, g = 2.5, h = 0.1)$
Posterior: $(\mu, \tau)^T \mid x \sim NGa(B = 5.4840, C = 23.25, G = 14, H = )$
$\mu \mid x \sim t_{28}(5.4840,\ \cdot\,)$, $\tau \mid x \sim Ga(14,\ \cdot\,)$, $\sigma \mid x \sim \text{Inv-Chi}(14,\ \cdot\,)$
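Since $\bar{x}$ and $s$ did not survive in the text above, only the posterior quantities that depend on the prior and the sample size can be checked directly; a quick sanity check in Python:

```python
# Prior NGa(b, c, g, h) for the Cavendish analysis, and n = 23 observations.
b, c, g, h = 5.41, 0.25, 2.5, 0.1
n = 23

C = c + n        # posterior C = c + n
G = g + n / 2    # posterior G = g + n/2
df = 2 * G       # degrees of freedom of the marginal t for mu
```

These reproduce $C = 23.25$, $G = 14$ and the 28 degrees of freedom of the marginal $t$ posterior for $\mu$.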
13 Comparison of priors and posteriors (Wikipedia: $\mu = $ g/cm³)
[Figure: prior and posterior densities for $\mu$, $\tau$ and $\sigma$]
14 Comparison of prior and posteriors
[Figure: contours of the joint posterior density of $(\mu, \tau)$]
15 Confidence intervals and regions
Point estimates: could use the posterior mean, mode, ...; not really worth having without some idea of uncertainty.
Interval estimates for univariate parameters: confidence intervals, credible intervals, Bayesian confidence intervals; highest density intervals (HDI), equi-tailed intervals. For symmetric posteriors, HDI = equi-tailed interval.
Here $\mu \mid x \sim t_{2G}\{B, H/(GC)\}$ is symmetric, so the HDI is easy. $\tau \mid x \sim Ga(G, H)$ is skewed, so the HDI is non-trivial but the equi-tailed interval is easy. Ditto for $\sigma \mid x \sim \text{Inv-Chi}(G, H)$.
16 Results from this data analysis...
95% confidence intervals (equi-tailed unless marked HDI):
                Prior            Posterior
µ:              (4.38, 6.44)     (5.40, 5.56)
τ:              (4.16, 64.16)    (15.07, 43.76)
τ HDI:          (1.48, 55.96)    (14.02, 42.25)
σ:              (0.12, 0.49)     (0.15, 0.26)
σ HDI:          (0.11, 0.42)     (0.15, 0.26)
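The equi-tailed intervals for $\tau$ and $\sigma$ come straight from gamma quantiles; for example, the prior column can be reproduced with SciPy (an illustrative sketch; `scipy.stats.gamma` is parameterised by shape and scale, where scale $= 1/\text{rate}$):

```python
from scipy.stats import gamma

# Equi-tailed 95% prior interval for tau ~ Ga(g, h) with g = 2.5, h = 0.1,
# and the induced interval for sigma = 1/sqrt(tau).
g, h = 2.5, 0.1
tau_lo = gamma.ppf(0.025, g, scale=1 / h)
tau_hi = gamma.ppf(0.975, g, scale=1 / h)
sigma_lo, sigma_hi = 1 / tau_hi ** 0.5, 1 / tau_lo ** 0.5
```

This reproduces the prior intervals (4.16, 64.16) for $\tau$ and (0.12, 0.49) for $\sigma$ in the table.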
17 Confidence regions
So far we have looked at univariate HDIs. It can be useful to also look at (joint) confidence regions.
Question: What is the $100(1 - \alpha)\%$ HDI region for $(\mu, \tau)^T$?
Hint: $-\log \pi(\mu, \tau \mid x) = \frac{\tau}{2}\left\{C(\mu - B)^2 + 2H\right\} - \left(G - \frac{1}{2}\right)\log \tau + \text{const}$
18 Hint: $-\log \pi(\mu, \tau \mid x) = \frac{\tau}{2}\left\{C(\mu - B)^2 + 2H\right\} - \left(G - \frac{1}{2}\right)\log \tau + \text{const}$
The $100(1 - \alpha)\%$ HDI region for $(\mu, \tau)^T$ is
$\left\{(\mu, \tau)^T : \pi(\mu, \tau \mid x) > k_\alpha\right\}$
$= \left\{(\mu, \tau)^T : -\log \pi(\mu, \tau \mid x) < k'_\alpha\right\}$
$= \left\{(\mu, \tau)^T : \frac{\tau}{2}\left\{C(\mu - B)^2 + 2H\right\} - \left(G - \frac{1}{2}\right)\log \tau < k'_\alpha\right\}$
How to determine $k'_\alpha$?
19 Need the posterior distribution function $F(\cdot)$ of
$Y(\mu, \tau) = \frac{\tau}{2}\left\{C(\mu - B)^2 + 2H\right\} - \left(G - \frac{1}{2}\right)\log \tau$
and take $k'_\alpha$ s.t. $F(k'_\alpha) = 1 - \alpha$
Not a standard distribution, so build up $F$ via simulation
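The simulation step can be sketched as follows: draw $(\mu, \tau)$ from the $NGa$ posterior (first $\tau$ from its gamma marginal, then $\mu \mid \tau$ from its normal conditional), evaluate $Y(\mu, \tau)$, and read off the empirical $(1 - \alpha)$ quantile. The function name is mine:

```python
import numpy as np

def hdi_region_cutoff(B, C, G, H, alpha, nsim=100_000, seed=1):
    """Monte Carlo estimate of the cutoff for the joint 100(1-alpha)%
    HDI region of (mu, tau) under the NGa(B, C, G, H) posterior."""
    rng = np.random.default_rng(seed)
    tau = rng.gamma(shape=G, scale=1 / H, size=nsim)   # tau | x ~ Ga(G, H)
    mu = rng.normal(B, 1 / np.sqrt(C * tau))           # mu | tau, x ~ N(B, 1/(C tau))
    y = tau / 2 * (C * (mu - B) ** 2 + 2 * H) - (G - 0.5) * np.log(tau)
    return np.quantile(y, 1 - alpha)
```

A point $(\mu, \tau)$ then lies inside the region exactly when $Y(\mu, \tau)$ is below the returned cutoff.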
20 Cavendish example
[Figure: 95%, 90% and 80% prior (dashed) and posterior (solid) confidence regions for $(\mu, \tau)^T$; 95% (outer), 80% (inner)]
21 Focusing on central part of plot...
[Figure: the same prior (dashed) and posterior (solid) confidence regions for $(\mu, \tau)^T$, zoomed in; 95% (outer), 80% (inner)]
22 Predictive distribution
The predictive density of a new observation $y$ is $f(y \mid x) = \int f(y \mid \mu, \tau)\,\pi(\mu, \tau \mid x)\, d\mu\, d\tau$
As this is a conjugate analysis, we can determine the predictive density using Candidate's formula:
$\pi(\theta \mid x, y) = \frac{\pi(\theta) f(x, y \mid \theta)}{f(x, y)} = \frac{\pi(\theta) f(x \mid \theta) f(y \mid \theta)}{f(x) f(y \mid x)} = \frac{\pi(\theta \mid x)\, f(y \mid \theta)}{f(y \mid x)}$
$\Rightarrow f(y \mid x) = \frac{f(y \mid \theta)\,\pi(\theta \mid x)}{\pi(\theta \mid x, y)}$
since $X$ and $Y$ are indep given $\theta$
But, for this model, there is a more straightforward way...
23 $Y \mid \mu, \tau \sim N\left(\mu, \frac{1}{\tau}\right)$, $(\mu, \tau)^T \mid x \sim NGa(B, C, G, H)$
$Y = \mu + \varepsilon$, with $\varepsilon \mid \tau \sim N\left(0, \frac{1}{\tau}\right)$, $\mu \mid x, \tau \sim N\left(B, \frac{1}{C\tau}\right)$, $\tau \mid x \sim Ga(G, H)$
$\Rightarrow Y \mid x, \tau \sim N\left(B, \frac{1}{\tau} + \frac{1}{C\tau}\right) \equiv N\left(B, \frac{C + 1}{C\tau}\right)$, $\tau \mid x \sim Ga(G, H)$
Already seen: $\mu \mid \tau, x \sim N\left(B, \frac{1}{C\tau}\right)$, $\tau \mid x \sim Ga(G, H)$ $\Rightarrow$ $\mu \mid x \sim t_{2G}\left(B, \frac{H}{GC}\right)$
$\Rightarrow Y \mid x \sim t_{2G}\left(B, \frac{H(C + 1)}{GC}\right)$
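The predictive $t_{2G}\{B, H(C+1)/(GC)\}$ is easy to work with numerically; a sketch using SciPy's `t` distribution, whose `scale` is the square root of the $c$ parameter in the slides' $t_a(b, c)$ notation:

```python
from scipy.stats import t as student_t

def predictive(B, C, G, H):
    """Frozen scipy distribution for Y | x ~ t_{2G}(B, H*(C+1)/(G*C))."""
    c = H * (C + 1) / (G * C)                 # the generalised-t 'c' parameter
    return student_t(df=2 * G, loc=B, scale=c ** 0.5)
```

For example, `predictive(B, C, G, H).interval(0.95)` gives an equi-tailed 95% predictive interval for a new observation.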
24 Predictive distribution of summary statistics
Future random sample $y_1, y_2, \dots, y_m$
Sufficient statistics: sample mean $\bar{Y}_m$ and (biased) sample variance $V_m = \sum_{i=1}^m (Y_i - \bar{Y})^2/m$
Questions:
1. What is the predictive distribution of $\bar{Y}_m$? Hint: mimic the argument used for a single future observation $Y$ on the previous slide.
2. What is the predictive distribution of $V_m$?
25 Predictive distribution of $\bar{Y}_m$
$\bar{Y}_m \mid \mu, \tau \sim N\left(\mu, \frac{1}{m\tau}\right)$, $(\mu, \tau)^T \mid x \sim NGa(B, C, G, H)$
$\bar{Y}_m = \mu + \varepsilon$, with $\varepsilon \mid \tau \sim N\left(0, \frac{1}{m\tau}\right)$, $\mu \mid x, \tau \sim N\left(B, \frac{1}{C\tau}\right)$, $\tau \mid x \sim Ga(G, H)$
$\Rightarrow \bar{Y}_m \mid x, \tau \sim N\left(B, \frac{1}{m\tau} + \frac{1}{C\tau}\right) \equiv N\left(B, \frac{C + m}{Cm\tau}\right)$, $\tau \mid x \sim Ga(G, H)$
$\Rightarrow \bar{Y}_m \mid x \sim t_{2G}\left(B, \frac{H(C + m)}{GCm}\right)$
Note that $\bar{Y}_m \mid x \xrightarrow{D} \mu \mid x$ as $m \to \infty$
26 Predictive distribution of $V_m = \sum_{i=1}^m (Y_i - \bar{Y})^2/m$
In normal random samples $(m - 1) S_u^2/\sigma^2 \sim \chi^2_{m-1}$ $\Rightarrow$ $m V_m \tau \sim \chi^2_{m-1}$ $\Rightarrow$ $V_m \mid \tau \sim Ga\left(\frac{m-1}{2}, \frac{m\tau}{2}\right)$
Predictive density for $V_m$:
$f(v \mid x) = \int_0^\infty f(v \mid \tau)\,\pi(\tau \mid x)\, d\tau = \frac{(m/2)^{\frac{m-1}{2}} H^G v^{\frac{m-1}{2} - 1}}{\Gamma\left(\frac{m-1}{2}\right)\Gamma(G)} \int_0^\infty \tau^{\frac{m-1}{2} + G - 1} e^{-(mv/2 + H)\tau}\, d\tau$
$= \frac{1}{B\left(\frac{m-1}{2}, G\right)} \left(\frac{m}{2H}\right)^{\frac{m-1}{2}} v^{\frac{m-1}{2} - 1} \left(1 + \frac{mv}{2H}\right)^{-\left(\frac{m-1}{2} + G\right)}$
$\Rightarrow \frac{mG}{(m - 1)H}\, V_m \mid x \sim F_{m-1, 2G}$
27 What happens as $m \to \infty$?
Actually look at the predictive distribution of $1/V_m$ as $m \to \infty$
As $F_{\nu_1, \nu_2} \equiv \frac{\chi^2_{\nu_1}/\nu_1}{\chi^2_{\nu_2}/\nu_2} \equiv \frac{Ga(\nu_1/2, \nu_1/2)}{Ga(\nu_2/2, \nu_2/2)}$:
$V_m \mid x \sim \frac{(m - 1)H}{mG}\, F_{m-1, 2G} \quad\Rightarrow\quad \frac{1}{V_m} \mid x \sim \frac{mG}{(m - 1)H} \cdot \frac{Ga(G, G)}{Ga\{(m - 1)/2, (m - 1)/2\}}$
Now $Ga\{(m - 1)/2, (m - 1)/2\}$ has mean 1 and variance $2/(m - 1)$
So as $m \to \infty$: $\frac{1}{V_m} \mid x \xrightarrow{D} \frac{G}{H}\, Ga(G, G) \equiv Ga(G, H) \equiv \tau \mid x$
28 Summary
Data: normal random sample $X_i \mid \mu, \tau \stackrel{\text{indep}}{\sim} N(\mu, 1/\tau)$, $i = 1, 2, \dots, n$
Prior: $(\mu, \tau)^T \sim NGa(b, c, g, h)$ is conjugate
Posterior: $(\mu, \tau)^T \mid x \sim NGa(B, C, G, H)$
Marginal posteriors: $\mu \mid x \sim t_{2G}\{B, H/(GC)\}$, $\tau \mid x \sim Ga(G, H)$, $\sigma \mid x \sim \text{Inv-Chi}(G, H)$
Marginal HDIs or equi-tailed CIs fairly easy to calculate; joint HDI regions a little trickier (but solved by simulation)
Predictive: $Y \mid x \sim t_{2G}\{B, H(C + 1)/(GC)\}$
29 Inference in a normal linear model
Introduction
Data: $(y_i, x_{i1}, \dots, x_{ip})$, $i = 1, 2, \dots, n$
Multiple linear regression model: $Y_i = \sum_{j=1}^p \beta_j x_{ij} + \varepsilon_i$, with $\varepsilon_i \mid \tau \stackrel{\text{indep}}{\sim} N(0, 1/\tau)$
In matrix notation: $Y = X\beta + \varepsilon$, where
$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}$, $X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$, $\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}$, $\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$
$\varepsilon_i \stackrel{\text{indep}}{\sim} N(0, 1/\tau) \Rightarrow \varepsilon \sim N_n(0, \tau^{-1} I_n)$
Therefore $Y \mid X, \beta, \tau \sim N_n(X\beta, \tau^{-1} I_n)$
30 Conjugate analysis
Data: $Y \mid X, \beta, \tau \sim N_n(X\beta, \tau^{-1} I_n)$
Likelihood function: $f(y \mid X, \beta, \tau) = (2\pi)^{-n/2}\,\tau^{n/2} \exp\left\{-\frac{\tau}{2}\left[ns^2 + (\beta - \hat{\beta})^T X^T X (\beta - \hat{\beta})\right]\right\}$
where $\hat{\beta} = (X^T X)^{-1} X^T y$ is the least squares (or max lik) estimate of $\beta$ and $s^2 = (y - X\hat{\beta})^T (y - X\hat{\beta})/n$ is the residual mean square
Conjugate prior: $\pi(\beta, \tau) \propto \tau^{g + \frac{p}{2} - 1} \exp\left[-\frac{\tau}{2}\left\{(\beta - b)^T c\,(\beta - b) + 2h\right\}\right] = \pi(\beta \mid \tau)\,\pi(\tau)$
Take $\beta \mid \tau \sim N_p\{b, (c\tau)^{-1}\}$, $\tau \sim Ga(g, h)$ and write $(\beta, \tau)^T \sim N_p Ga(b, c, g, h)$. The prior density is, for $\beta \in \mathbb{R}^p$, $\tau > 0$
$\pi(\beta, \tau) \propto \tau^{g + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - b)^T c\,(\beta - b) + 2h\right]\right\}$
31 Question
What's the posterior distribution for $(\beta, \tau)^T$?
Hint: $(\beta - b)^T c\,(\beta - b) + (\beta - \hat{\beta})^T X^T X (\beta - \hat{\beta}) = (\beta - B)^T (c + X^T X)(\beta - B) - B^T (c + X^T X) B + b^T c\, b + \hat{\beta}^T X^T X \hat{\beta}$
where $B = (c + X^T X)^{-1}(cb + X^T X \hat{\beta})$
32 Hint: $(\beta - b)^T c\,(\beta - b) + (\beta - \hat{\beta})^T X^T X (\beta - \hat{\beta}) = (\beta - B)^T (c + X^T X)(\beta - B) - B^T (c + X^T X) B + b^T c\, b + \hat{\beta}^T X^T X \hat{\beta}$
Using Bayes' Theorem, the posterior density is, for $\beta \in \mathbb{R}^p$, $\tau > 0$
$\pi(\beta, \tau \mid D) \propto \pi(\beta, \tau)\, f(y \mid X, \beta, \tau)$
$\propto \tau^{g + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - b)^T c\,(\beta - b) + 2h\right]\right\} \times \tau^{\frac{n}{2}} \exp\left\{-\frac{\tau}{2}\left[ns^2 + (\beta - \hat{\beta})^T X^T X (\beta - \hat{\beta})\right]\right\}$
$\propto \tau^{g + \frac{n}{2} + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - B)^T (c + X^T X)(\beta - B) + 2h + ns^2 - B^T (c + X^T X) B + b^T c\, b + \hat{\beta}^T X^T X \hat{\beta}\right]\right\}$
$\propto \tau^{G + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - B)^T C\,(\beta - B) + 2H\right]\right\}$
33 $\pi(\beta, \tau \mid D) \propto \tau^{G + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - B)^T C\,(\beta - B) + 2H\right]\right\}$
where $B = (c + X^T X)^{-1}(cb + X^T X \hat{\beta})$, $C = c + X^T X$, $G = g + \frac{n}{2}$,
$H = h + \frac{1}{2}\left\{ns^2 - B^T (c + X^T X) B + b^T c\, b + \hat{\beta}^T X^T X \hat{\beta}\right\}$
Therefore $(\beta, \tau)^T \mid D \sim N_p Ga(B, C, G, H)$
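The matrix update mirrors the scalar case; a minimal sketch with NumPy (names follow the slides; `npga_posterior` is my own helper):

```python
import numpy as np

def npga_posterior(b, c, g, h, X, y):
    """Conjugate update for the normal linear model:
    N_p Ga(b, c, g, h) prior -> N_p Ga(B, C, G, H) posterior."""
    n = len(y)
    XtX = X.T @ X
    beta_hat = np.linalg.solve(XtX, X.T @ y)         # least squares estimate
    ns2 = float((y - X @ beta_hat) @ (y - X @ beta_hat))
    C = c + XtX
    B = np.linalg.solve(C, c @ b + XtX @ beta_hat)
    G = g + n / 2
    H = h + 0.5 * (ns2 - B @ C @ B + b @ c @ b + beta_hat @ XtX @ beta_hat)
    return B, C, G, H
```

With $p = 1$ and $X$ a column of ones, this reduces to the normal-random-sample update earlier in the tutorial.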
34 Posterior analysis
Posterior: $(\beta, \tau)^T \mid D \sim N_p Ga(B, C, G, H)$. Clearly $\tau \mid D \sim Ga(G, H)$.
Question: What's the marginal posterior for $\beta$?
Hints:
1. Posterior density: $\pi(\beta, \tau \mid D) \propto \tau^{G + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - B)^T C\,(\beta - B) + 2H\right]\right\}$
2. $\int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \Gamma(a)/b^a$
3. If $X \sim t_a(b, c)$ then it has density $f(x \mid a, b, c) \propto \left\{1 + \frac{(x - b)^T c^{-1} (x - b)}{a}\right\}^{-\frac{a+p}{2}}$, $x \in \mathbb{R}^p$
35 Hints: $\int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \Gamma(a)/b^a$ and $f(x \mid a, b, c) \propto \left\{1 + \frac{(x - b)^T c^{-1} (x - b)}{a}\right\}^{-\frac{a+p}{2}}$
The posterior density for $\beta$ is, for $\beta \in \mathbb{R}^p$
$\pi(\beta \mid D) = \int_0^\infty \pi(\beta, \tau \mid D)\, d\tau \propto \int_0^\infty \tau^{G + \frac{p}{2} - 1} \exp\left\{-\frac{\tau}{2}\left[(\beta - B)^T C\,(\beta - B) + 2H\right]\right\} d\tau$
$= \frac{\Gamma\left(G + \frac{p}{2}\right)}{\left[\{(\beta - B)^T C\,(\beta - B) + 2H\}/2\right]^{G + \frac{p}{2}}}$ using $\int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \Gamma(a)/b^a$
$\propto \left\{1 + \frac{(\beta - B)^T C\,(\beta - B)}{2H}\right\}^{-\left(G + \frac{p}{2}\right)}$
Therefore $\beta \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right)$
36 Some more distribution theory
p-dimensional t distribution: $X \sim t_a(b, c)$
Density for $x \in \mathbb{R}^p$: $f(x \mid a, b, c) = \frac{\Gamma\left(\frac{a+p}{2}\right)}{|c|^{1/2} (a\pi)^{p/2}\,\Gamma\left(\frac{a}{2}\right)} \left\{1 + \frac{(x - b)^T c^{-1} (x - b)}{a}\right\}^{-\frac{a+p}{2}}$
Parameters: $a > 0$, $b = (b_i) \in \mathbb{R}^p$, $c = (c_{ij})$ a symmetric positive definite matrix
Generalisation of the univariate $t_a(b, c)$ distribution
37 Properties of $X \sim t_a(b, c)$
1. $E(X) = \text{Mode}(X) = b$ and $Var(X) = \frac{ac}{a - 2}$, $a > 2$; univariate: $E(X_i) = b_i$, $Var(X_i) = \frac{a c_{ii}}{a - 2}$ ($a > 2$), $Corr(X_i, X_j) = c_{ij}/\sqrt{c_{ii} c_{jj}}$
2. If $A$ is $q \times p$ and $d$ is a $q \times 1$ vector: $AX + d \sim t_a(Ab + d, A c A^T)$
3. Univariate marginals: $X_i \sim t_a(b_i, c_{ii})$
4. The contours of its density are ellipsoids centred at $x = b$: $\{x : (x - b)^T c^{-1} (x - b) = k\}$
5. $(X - b)^T c^{-1} (X - b)/p \sim F_{p, a}$
38 Summary of the posterior distribution
Posterior: $(\beta, \tau)^T \mid D \sim N_p Ga(B, C, G, H)$
Marginal distributions: $\beta \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right)$, $\beta_i \mid D \sim t_{2G}\left(B_i, \frac{H}{G}\, (C^{-1})_{ii}\right)$ for $i = 1, \dots, p$, $\tau \mid D \sim Ga(G, H)$, $\sigma \mid D \sim \text{Inv-Chi}(G, H)$
39 Confidence intervals and regions
Regression parameters: $\beta_i \mid D \sim t_{2G}\left(B_i, \frac{H}{G}\, (C^{-1})_{ii}\right)$
The $100(1 - \alpha)\%$ HDI for $\beta_i$ is just an equi-tailed interval:
$\left(B_i - t_{2G;\alpha/2}\sqrt{\frac{H (C^{-1})_{ii}}{G}},\ B_i + t_{2G;\alpha/2}\sqrt{\frac{H (C^{-1})_{ii}}{G}}\right)$
where $t_{2G;\alpha/2}$ is the upper $\alpha/2$ point of the $t_{2G}$ distribution
Noise parameters: $\tau \mid D \sim Ga(G, H)$, $\sigma \mid D \sim \text{Inv-Chi}(G, H)$; HDIs for $\tau$ or $\sigma$ are non-trivial, but equi-tailed intervals are easy
40 Questions
What is the form of an HDI-type confidence region for the regression parameters?
Hint: $\pi(\beta \mid D) \propto \left\{1 + \frac{(\beta - B)^T C\,(\beta - B)}{2H}\right\}^{-\left(G + \frac{p}{2}\right)}$, $\beta \in \mathbb{R}^p$
Determine the $100(1 - \alpha)\%$ HDI-type confidence region. Hints:
1. $\beta \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right)$
2. Property 5 of the p-dim t-distribution: if $X \sim t_a(b, c)$ then $(X - b)^T c^{-1} (X - b)/p \sim F_{p, a}$
41 What is the form of an HDI-type confidence region for $\beta$?
Hint: $\pi(\beta \mid D) \propto \left\{1 + \frac{(\beta - B)^T C\,(\beta - B)}{2H}\right\}^{-\left(G + \frac{p}{2}\right)}$
HDI: $\{\beta : \pi(\beta \mid D) > u\} = \{\beta : (\beta - B)^T C\,(\beta - B) < k\}$
Determine the $100(1 - \alpha)\%$ HDI-type confidence region for $\beta$:
As $X \sim t_a(b, c) \Rightarrow (X - b)^T c^{-1} (X - b)/p \sim F_{p, a}$, we have $(\beta - B)^T \left(\frac{H}{G}\, C^{-1}\right)^{-1} (\beta - B)/p \sim F_{p, 2G}$, i.e. $(\beta - B)^T C\,(\beta - B) \sim \frac{pH}{G}\, F_{p, 2G}$
so the $100(1 - \alpha)\%$ confidence region is $\left\{\beta : (\beta - B)^T C\,(\beta - B) < \frac{pH}{G}\, F_{p, 2G; 1 - \alpha}\right\}$
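Checking whether a particular $\beta$ lies in this region only needs a quadratic form and an F quantile; a sketch (the function name is mine):

```python
import numpy as np
from scipy.stats import f as f_dist

def in_hdi_region(beta, B, C, G, H, alpha=0.05):
    """True iff beta lies in the 100(1-alpha)% HDI-type region
    {beta : (beta - B)^T C (beta - B) < p*H/G * F_{p,2G;1-alpha}}."""
    p = len(B)
    d = np.asarray(beta) - B
    return float(d @ C @ d) < p * H / G * f_dist.ppf(1 - alpha, p, 2 * G)
```

The same inequality, evaluated on a grid of $\beta$ values, traces out the elliptical region boundaries shown in the plots later in the tutorial.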
42 Predictive distributions
Want to predict the response $Y$ when the covariate is $x$
Model: $Y = \beta^T x + \varepsilon$, where $\varepsilon \mid \tau \sim N(0, 1/\tau)$
From the posterior $\beta \mid \tau, D \sim N_p\{B, (C\tau)^{-1}\}$: $\beta^T x \mid \tau, D \sim N\{B^T x, x^T (C\tau)^{-1} x\}$, so $Y \mid x, \tau, D \sim N\{B^T x, (x^T C^{-1} x + 1)/\tau\}$, with $\tau \mid D \sim Ga(G, H)$
By the same argument as before (cf. $\mu \mid \tau, x \sim N\left(B, \frac{1}{C\tau}\right)$, $\tau \mid x \sim Ga(G, H) \Rightarrow \mu \mid x \sim t_{2G}\left(B, \frac{H}{GC}\right)$):
$Y \mid x, D \sim t_{2G}\{B^T x, H(x^T C^{-1} x + 1)/G\}$
43 Summary
Data: $Y \mid X, \beta, \tau \sim N_n(X\beta, \tau^{-1} I_n)$
Prior: $(\beta, \tau)^T \sim N_p Ga(b, c, g, h)$ is conjugate
Posterior: $(\beta, \tau)^T \mid D \sim N_p Ga(B, C, G, H)$
Marginal posteriors: $\beta \mid D \sim t_{2G}(B, H C^{-1}/G)$, $\beta_i \mid D \sim t_{2G}\{B_i, H (C^{-1})_{ii}/G\}$, $\tau \mid D \sim Ga(G, H)$, $\sigma \mid D \sim \text{Inv-Chi}(G, H)$
Univariate HDIs or equi-tailed CIs fairly easy to calculate; HDI regions for $\beta$ are interiors of ellipsoids
Predictive: $Y \mid x, D \sim t_{2G}\{B^T x, H(x^T C^{-1} x + 1)/G\}$
44 Linear regression example
Malcolm wants to be able to predict the height ($Y$) of a student from their shoe size ($X$). He thinks that a simple linear regression $Y = \alpha + \beta X + \varepsilon$ explains the relationship between the variables.
Prior elicitation
Malcolm decides that he will set up his prior by using information about his measurements and those of his wife.
- His shoe size is 11 and he is 74 inches tall. Therefore he gives prior $E(\alpha + 11\beta) = 74$ inches.
- He realises that he is probably not exactly the average height for size 11 shoe-wearers: he feels he is unlikely to be more than six inches from this conditional mean, and so gives a prior $SD(\alpha + 11\beta) = 3$ inches.
- His wife takes a size 5 shoe and is 64 inches tall, and so he takes prior $E(\alpha + 5\beta) = 64$ inches and $SD(\alpha + 5\beta) = 3$ inches.
- He decides that his beliefs about $\alpha + 11\beta$ and $\alpha + 5\beta$ are independent.
- Finally, he believes that the measurement error precision $\tau$ has mean 0.5 and variance 0.125.
45 Question
We need to know the parameters of his conjugate prior distribution $(\alpha, \beta, \tau)^T \sim N_2 Ga(b, c, g, h)$.
1. How might you calculate their values from $E(\alpha + 11\beta) = 74$ inches, $SD(\alpha + 11\beta) = 3$ inches, $E(\alpha + 5\beta) = 64$ inches, $SD(\alpha + 5\beta) = 3$ inches, independence of $\alpha + 11\beta$ and $\alpha + 5\beta$, and $\tau$ having mean 0.5 and variance 0.125?
2. Roughly what size is $Corr(\alpha, \beta)$?
46 $\tau \sim Ga(g, h) \Rightarrow E(\tau) = g/h$ and $Var(\tau) = g/h^2$
Therefore, as Malcolm requires $E(\tau) = 0.5$ and $Var(\tau) = 0.125$:
$g = \frac{E(\tau)^2}{Var(\tau)} = 2$ and $h = \frac{E(\tau)}{Var(\tau)} = 4$
47 $E(\alpha + 11\beta) = 74$, $Var(\alpha + 11\beta) = 9$, $E(\alpha + 5\beta) = 64$, $Var(\alpha + 5\beta) = 9$, $Cov(\alpha + 11\beta, \alpha + 5\beta) = 0$
$\Rightarrow E(\alpha) + 11 E(\beta) = 74$, $E(\alpha) + 5 E(\beta) = 64$,
$Var(\alpha) + 121 Var(\beta) + 22 Cov(\alpha, \beta) = 9$, $Var(\alpha) + 25 Var(\beta) + 10 Cov(\alpha, \beta) = 9$, $Var(\alpha) + 55 Var(\beta) + 16 Cov(\alpha, \beta) = 0$
Solving: $\begin{pmatrix} E(\alpha) \\ E(\beta) \end{pmatrix} = \begin{pmatrix} 167/3 \\ 5/3 \end{pmatrix}$ and $\begin{pmatrix} Var(\alpha) & Cov(\alpha, \beta) \\ Cov(\alpha, \beta) & Var(\beta) \end{pmatrix} = \begin{pmatrix} 36.5 & -4 \\ -4 & 0.5 \end{pmatrix}$
$\Rightarrow Corr(\alpha, \beta) = \frac{-4}{\sqrt{36.5 \times 0.5}} = -0.94$: very strong negative
48 Marginal prior: $\beta \sim t_{2g}\left(b, \frac{h}{g}\, c^{-1}\right)$ with $E(\beta) = b$ and $Var(\beta) = \frac{2g}{2g - 2} \cdot \frac{h}{g}\, c^{-1}$
Therefore $b = \begin{pmatrix} 167/3 \\ 5/3 \end{pmatrix}$ and, with $g = 2$ and $h = 4$, $Var(\beta) = 4 c^{-1}$, so $c = 4\, Var(\beta)^{-1} = \begin{pmatrix} 0.889 & 7.111 \\ 7.111 & 64.889 \end{pmatrix}$
Malcolm's conjugate prior: $(\alpha, \beta, \tau)^T \sim N_2 Ga\left\{b = \begin{pmatrix} 167/3 \\ 5/3 \end{pmatrix},\ c = \begin{pmatrix} 0.889 & 7.111 \\ 7.111 & 64.889 \end{pmatrix},\ g = 2,\ h = 4\right\}$
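Malcolm's elicitation reduces to two small linear solves; a sketch in NumPy reproducing the numbers above (with $g = 2$, $h = 4$ from the previous slide):

```python
import numpy as np

# Means: E(a) + 11 E(b) = 74 and E(a) + 5 E(b) = 64.
mean = np.linalg.solve(np.array([[1.0, 11.0], [1.0, 5.0]]),
                       np.array([74.0, 64.0]))          # (167/3, 5/3)

# Variances: Var(a) + 121 Var(b) + 22 Cov = 9,
#            Var(a) +  25 Var(b) + 10 Cov = 9,
#            Var(a) +  55 Var(b) + 16 Cov = 0 (independence).
M = np.array([[1.0, 121.0, 22.0],
              [1.0,  25.0, 10.0],
              [1.0,  55.0, 16.0]])
var_a, var_b, cov = np.linalg.solve(M, np.array([9.0, 9.0, 0.0]))
corr = cov / np.sqrt(var_a * var_b)

# Marginal prior variance is {2g/(2g-2)}*(h/g)*c^{-1} = 4*c^{-1},
# so c = 4 * Var^{-1}.
g, h = 2.0, 4.0
V = np.array([[var_a, cov], [cov, var_b]])
c = (2 * g / (2 * g - 2)) * (h / g) * np.linalg.inv(V)
```

This yields $Var(\alpha) = 36.5$, $Var(\beta) = 0.5$, $Cov(\alpha, \beta) = -4$ and the prior scale matrix $c$.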
49 Malcolm's data
[Figure: scatter plot of height against shoe size]
50 Data calculations
$\hat{\beta} = \begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = $ , $ns^2 = $ , $X^T X = \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix} = \begin{pmatrix} 52 & \cdot \\ \cdot & \cdot \end{pmatrix}$
Malcolm's posterior: $(\alpha, \beta, \tau)^T \mid D \sim N_2 Ga\{B = \cdot,\ C = \cdot,\ G = 28,\ H = \cdot\}$
51 Marginal prior and posterior distributions
[Figure: prior and posterior densities for $\alpha$, $\beta$, $\tau$ and $\sigma$]
52 95% confidence intervals
$\alpha$: (54.42, 56.98); $\beta$: (1.45, 1.75); $\tau$: (0.55, 1.18), HDI (0.57, 1.20); $\sigma$: (0.90, 1.31), HDI (0.91, 1.32)
Confidence regions
Here $p = 2$ and so the $100(1 - \alpha)\%$ confidence region for $\beta = (\alpha, \beta)^T$ is $\left\{\beta : (\beta - B)^T C\,(\beta - B) < \frac{2H}{G}\, F_{2, 2G; 1 - \alpha}\right\}$
53 Prior and posterior confidence regions
[Figure: 95%, 90% and 80% prior (dashed) and posterior (solid) confidence regions for $\beta = (\alpha, \beta)^T$; 95% (outer), 80% (inner)]
54 Focusing on central part of plot...
[Figure: the same confidence regions for $\beta = (\alpha, \beta)^T$, zoomed in]
55 One-way ANOVA example
Charlotte is a vet studying the effect of three new diets (Diets 1, 2 and 3) on blood coagulation times (in seconds) in a set of animals.
Model the response $Y_{ij}$ of animal $j$ on diet $i$ as $Y_{ij} = \mu_i + \varepsilon_{ij}$, with $\varepsilon_{ij} \stackrel{\text{indep}}{\sim} N(0, 1/\tau)$.
This can be rewritten as a normal linear model. We need Charlotte's conjugate prior $(\mu_1, \mu_2, \mu_3, \tau)^T \sim N_3 Ga(b, c, g, h)$.
56 Charlotte's prior knowledge
Her previous experience with coagulation times for this sort of animal is that they have a range of around 12 seconds. Taking this as a (rough) 95% interval suggests that $4 \times SD(Y_{ij}) \approx 12$, i.e. $\tau \approx 1/9$. She sets the prior mean $E(\tau) = g/h = 1/9$.
She thinks it unlikely that $\sigma$ will be greater than 6 seconds and so requires $Pr(\tau < 1/36) \le 0.05$. She decides on $g = 24/9$ and $h = 24$ as this gives $Pr(\tau < 1/36) = 0.053$.
57 Charlotte's prior knowledge
She has no specialist information about any of the diets, so she decides to treat the diets as exchangeable, taking
$b = b\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ and $c = k\begin{pmatrix} 1 & r & r \\ r & 1 & r \\ r & r & 1 \end{pmatrix}$
for some choice of $b \in \mathbb{R}$, $k > 0$ and $r \in (-1/2, 1)$.
She decides that $E(\mu_i) = b = 60$ seconds, $Corr(\mu_i, \mu_j) = -r/(1 + r) = 2/3$ (so $r = -0.4$) and $SD(\mu_i) \approx 2.5$ seconds (so $k = 4.94$).
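Her value $k = 4.94$ can be checked numerically: with $c = kR$, where $R$ is the equicorrelation matrix with off-diagonal $r = -0.4$, the marginal prior variance of each $\mu_i$ is $\{2g/(2g - 2)\}(h/g)(c^{-1})_{ii}$, and setting the SD to 2.5 pins down $k$. A sketch:

```python
import numpy as np

g, h, r = 24 / 9, 24.0, -0.4
R = np.full((3, 3), r) + (1 - r) * np.eye(3)   # equicorrelation structure
Rinv_ii = np.linalg.inv(R)[0, 0]               # (R^{-1})_{ii}

# Var(mu_i) = {2g/(2g-2)} * (h/g) * (1/k) * (R^{-1})_{ii} = 2.5**2  =>  k:
k = (2 * g / (2 * g - 2)) * (h / g) * Rinv_ii / 2.5 ** 2
```

This recovers $k \approx 4.94$, matching the slide.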
58 Charlotte's conjugate prior
$(\mu_1, \mu_2, \mu_3, \tau)^T \sim N_3 Ga\left\{b = 60\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},\ c = 4.94\begin{pmatrix} 1 & -0.4 & -0.4 \\ -0.4 & 1 & -0.4 \\ -0.4 & -0.4 & 1 \end{pmatrix},\ g = 24/9,\ h = 24\right\}$
59 The data
[Table: blood coagulation times (seconds) for animals on Diets 1, 2 and 3; values not reproduced]
In fact these data come from Linear Models with R by J.J. Faraway (2004)
60 Rewrite the ANOVA model as a normal linear model
Take $X$ to be the $n \times 3$ indicator matrix whose row for an animal on diet $i$ has a 1 in column $i$ and 0s elsewhere, i.e.
$X = \begin{pmatrix} 1_{n_1} & 0 & 0 \\ 0 & 1_{n_2} & 0 \\ 0 & 0 & 1_{n_3} \end{pmatrix}$ and $\beta = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix}$
61 Data calculations
$ns^2 = 98$, $X^T X = \text{diag}(n_1, n_2, n_3) = \text{diag}(4, 6, 8)$, $\hat{\beta} = \begin{pmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \hat{\mu}_3 \end{pmatrix} = \begin{pmatrix} 61 \\ 66 \\ 61 \end{pmatrix}$
Charlotte's posterior: $(\mu_1, \mu_2, \mu_3, \tau)^T \mid D \sim N_3 Ga\{B = \cdot,\ C = \cdot,\ G = 24/9 + 9 \approx 11.67,\ H = \cdot\}$
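Charlotte's posterior can be reconstructed from the sufficient statistics on this slide together with the prior above (the prior scale matrix $c = 4.94\,R$ with $r = -0.4$ is taken from the elicitation slides, so this is a sketch under that reconstruction):

```python
import numpy as np

# Reconstructed prior: b = 60*(1,1,1)^T, c = k*R with k = 4.94, r = -0.4.
k, r = 4.94, -0.4
c = k * (np.full((3, 3), r) + (1 - r) * np.eye(3))
b = np.full(3, 60.0)
g, h, n = 24 / 9, 24.0, 18

# Sufficient statistics from the slide.
XtX = np.diag([4.0, 6.0, 8.0])
beta_hat = np.array([61.0, 66.0, 61.0])
ns2 = 98.0

# Posterior N_3 Ga(B, C, G, H).
C = c + XtX
B = np.linalg.solve(C, c @ b + XtX @ beta_hat)
G = g + n / 2
H = h + 0.5 * (ns2 - B @ C @ B + b @ c @ b + beta_hat @ XtX @ beta_hat)
```

Note how $B$ shrinks the group means towards the common prior mean of 60, with $\mu_2$ remaining clearly the largest.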
62 Posterior for treatment means
Posterior distribution: $\mu \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right)$
[The slide displays the posterior mean $E(\mu \mid D)$, the posterior variance matrix $Var(\mu \mid D)$ and the posterior correlation matrix; the entries were not reproduced apart from a correlation of 0.211]
63 Prior and posterior distributions of treatment means
[Figure: prior and posterior densities for $\mu_1$, $\mu_2$ and $\mu_3$]
64 Prior and posterior distributions for within-treatment group variation
[Figure: prior and posterior densities for $\tau$ and $\sigma$]
65 95% confidence intervals
$\mu_1$: (59.41, 63.82); $\mu_2$: (61.86, 65.84); $\mu_3$: (59.63, 63.28)
$\tau$: (0.050, 0.170), HDI (0.055, 0.177); $\sigma$: (2.291, 4.130), HDI (2.377, 4.270)
66 Alternative parameterisation of the model
Standard model: $Y_{ij} = \mu_i + \varepsilon_{ij}$ $\Rightarrow$ $\beta = (\mu_1, \mu_2, \mu_3)^T$
Alternative parametrisation: $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ $\Rightarrow$ $\beta = (\mu, \alpha_2, \alpha_3)^T$
with $\alpha_1 \equiv 0$ for identifiability, so that $\mu = \mu_1$, $\alpha_2 = \mu_2 - \mu_1$ and $\alpha_3 = \mu_3 - \mu_1$
67 Question
What is the (joint) posterior distribution of the treatment mean differences $\alpha = (\alpha_2, \alpha_3)^T$, where $\alpha_2 = \mu_2 - \mu_1$, $\alpha_3 = \mu_3 - \mu_1$?
Hints:
1. Posterior distribution: $\mu \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right)$
2. If $X \sim t_a(b, c)$, $A$ is a $q \times p$ matrix and $d$ is a $q \times 1$ vector, then $AX + d \sim t_a(Ab + d, A c A^T)$
68 Hints: (1) $\mu \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right)$; (2) if $X \sim t_a(b, c)$ then $AX + d \sim t_a(Ab + d, A c A^T)$
We have $\begin{pmatrix} \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix}$, i.e. $\alpha = A\mu$
$\mu \mid D \sim t_{2G}\left(B, \frac{H}{G}\, C^{-1}\right) \Rightarrow \alpha \mid D \sim t_{2G}\left(AB, \frac{H}{G}\, A C^{-1} A^T\right)$
69 Prior and posterior distributions for mean treatment differences
[Figure: prior and posterior densities for $\alpha_2$ and $\alpha_3$]
95% HDIs: $\alpha_2$: $(-0.373, 4.827)$; $\alpha_3$: $(-2.697, 2.365)$
70 [Figure: 95%, 90% and 80% prior (dashed) and posterior (solid) confidence regions for $\alpha = (\alpha_2, \alpha_3)^T$; 95% (outer), 80% (inner)]
71 Focusing on the central part of the plot...
[Figure: the same confidence regions for $\alpha = (\alpha_2, \alpha_3)^T$, zoomed in]
72 Thanks for listening
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)
More informationBayesian Inference. Chapter 2: Conjugate models
Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationMAS3301 Bayesian Statistics
MAS3301 Bayesian Statistics M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2008-9 1 15 Inference for Normal Distributions II 15.1 Student s t-distribution When we look
More informationBayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence
Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns
More information9 Bayesian inference. 9.1 Subjective probability
9 Bayesian inference 1702-1761 9.1 Subjective probability This is probability regarded as degree of belief. A subjective probability of an event A is assessed as p if you are prepared to stake pm to win
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationBayesian RL Seminar. Chris Mansley September 9, 2008
Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in
More informationSTAT215: Solutions for Homework 2
STAT25: Solutions for Homework 2 Due: Wednesday, Feb 4. (0 pt) Suppose we take one observation, X, from the discrete distribution, x 2 0 2 Pr(X x θ) ( θ)/4 θ/2 /2 (3 θ)/2 θ/4, 0 θ Find an unbiased estimator
More informationSTAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01
STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist
More informationLECTURE 5 NOTES. n t. t Γ(a)Γ(b) pt+a 1 (1 p) n t+b 1. The marginal density of t is. Γ(t + a)γ(n t + b) Γ(n + a + b)
LECTURE 5 NOTES 1. Bayesian point estimators. In the conventional (frequentist) approach to statistical inference, the parameter θ Θ is considered a fixed quantity. In the Bayesian approach, it is considered
More informationGaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationBayesian inference. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. April 10, 2017
Bayesian inference Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 10, 2017 1 / 22 Outline for today A genetic example Bayes theorem Examples Priors Posterior summaries
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationRandom Vectors 1. STA442/2101 Fall See last slide for copyright information. 1 / 30
Random Vectors 1 STA442/2101 Fall 2017 1 See last slide for copyright information. 1 / 30 Background Reading: Renscher and Schaalje s Linear models in statistics Chapter 3 on Random Vectors and Matrices
More informationLinear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.
Linear Mixed Models One-way layout Y = Xβ + Zb + ɛ where X and Z are specified design matrices, β is a vector of fixed effect coefficients, b and ɛ are random, mean zero, Gaussian if needed. Usually think
More informationPartial factor modeling: predictor-dependent shrinkage for linear regression
modeling: predictor-dependent shrinkage for linear Richard Hahn, Carlos Carvalho and Sayan Mukherjee JASA 2013 Review by Esther Salazar Duke University December, 2013 Factor framework The factor framework
More informationStatistical Inference: Maximum Likelihood and Bayesian Approaches
Statistical Inference: Maximum Likelihood and Bayesian Approaches Surya Tokdar From model to inference So a statistical analysis begins by setting up a model {f (x θ) : θ Θ} for data X. Next we observe
More informationB4 Estimation and Inference
B4 Estimation and Inference 6 Lectures Hilary Term 27 2 Tutorial Sheets A. Zisserman Overview Lectures 1 & 2: Introduction sensors, and basics of probability density functions for representing sensor error
More informationChapter 9: Interval Estimation and Confidence Sets Lecture 16: Confidence sets and credible sets
Chapter 9: Interval Estimation and Confidence Sets Lecture 16: Confidence sets and credible sets Confidence sets We consider a sample X from a population indexed by θ Θ R k. We are interested in ϑ, a vector-valued
More informationBayesian Inference for DSGE Models. Lawrence J. Christiano
Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.
More informationSome Curiosities Arising in Objective Bayesian Analysis
. Some Curiosities Arising in Objective Bayesian Analysis Jim Berger Duke University Statistical and Applied Mathematical Institute Yale University May 15, 2009 1 Three vignettes related to John s work
More informationSTA 302f16 Assignment Five 1
STA 30f16 Assignment Five 1 Except for Problem??, these problems are preparation for the quiz in tutorial on Thursday October 0th, and are not to be handed in As usual, at times you may be asked to prove
More informationLIST OF FORMULAS FOR STK1100 AND STK1110
LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function
More informationOverall Objective Priors
Overall Objective Priors Jim Berger, Jose Bernardo and Dongchu Sun Duke University, University of Valencia and University of Missouri Recent advances in statistical inference: theory and case studies University
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More information18 Bivariate normal distribution I
8 Bivariate normal distribution I 8 Example Imagine firing arrows at a target Hopefully they will fall close to the target centre As we fire more arrows we find a high density near the centre and fewer
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee and Andrew O. Finley 2 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More informationMAS3301 Bayesian Statistics
MAS3301 Bayesian Statistics M. Farrow School of Mathematics and Statistics Newcastle University Semester, 008-9 1 13 Sequential updating 13.1 Theory We have seen how we can change our beliefs about an
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More informationParametric Techniques Lecture 3
Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to
More informationGibbs Sampling in Linear Models #2
Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling
More informationMaking rating curves - the Bayesian approach
Making rating curves - the Bayesian approach Rating curves what is wanted? A best estimate of the relationship between stage and discharge at a given place in a river. The relationship should be on the
More informationVariational Bayes. A key quantity in Bayesian inference is the marginal likelihood of a set of data D given a model M
A key quantity in Bayesian inference is the marginal likelihood of a set of data D given a model M PD M = PD θ, MPθ Mdθ Lecture 14 : Variational Bayes where θ are the parameters of the model and Pθ M is
More informationMIT Spring 2015
Regression Analysis MIT 18.472 Dr. Kempthorne Spring 2015 1 Outline Regression Analysis 1 Regression Analysis 2 Multiple Linear Regression: Setup Data Set n cases i = 1, 2,..., n 1 Response (dependent)
More informationThe Relationship Between the Power Prior and Hierarchical Models
Bayesian Analysis 006, Number 3, pp. 55 574 The Relationship Between the Power Prior and Hierarchical Models Ming-Hui Chen, and Joseph G. Ibrahim Abstract. The power prior has emerged as a useful informative
More informationLecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016
Lecture 3 Probability - Part 2 Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza October 19, 2016 Luigi Freda ( La Sapienza University) Lecture 3 October 19, 2016 1 / 46 Outline 1 Common Continuous
More informationGeneral Bayesian Inference I
General Bayesian Inference I Outline: Basic concepts, One-parameter models, Noninformative priors. Reading: Chapters 10 and 11 in Kay-I. (Occasional) Simplified Notation. When there is no potential for
More informationChapter 14. Linear least squares
Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given
More information10. Exchangeability and hierarchical models Objective. Recommended reading
10. Exchangeability and hierarchical models Objective Introduce exchangeability and its relation to Bayesian hierarchical models. Show how to fit such models using fully and empirical Bayesian methods.
More information