Part III. A Decision-Theoretic Approach and Bayesian testing


Chapter 10
Bayesian Inference as a Decision Problem

The decision-theoretic framework starts with the following situation: we would like to choose between various possible actions after observing data. Let $\Theta$ denote the set of all possible states of nature (values of the parameter), and let $D$ denote the set of all possible decisions (actions). A loss function is any function $L : \Theta \times D \to [0, \infty)$; the idea is that $L(\theta, d)$ gives the cost (penalty) associated with decision $d$ if the true state of the world is $\theta$.

10.1 Inference as a decision problem

Denote by $f(x \mid \theta)$ the sampling distribution, for a sample $x \in X$, and let $L(\theta, \delta)$ be our loss function. Often the decision $d$ is to evaluate or estimate a function $h(\theta)$ as accurately as possible.

For point estimation we want to evaluate $h(\theta) = \theta$; our set of decisions is $D = \Theta$, the parameter space, and $L(\theta, d)$ is the loss in reporting $d$ when $\theta$ is true.

For hypothesis testing, if we want to test $H_0 : \theta \in \Theta_0$, then our possible decisions are $D = \{\text{accept } H_0, \text{reject } H_0\}$.

We would like to evaluate the function
\[ h(\theta) = \begin{cases} 1 & \text{if } \theta \in \Theta_0 \\ 0 & \text{otherwise} \end{cases} \]
as accurately as possible. Our loss function is
\[ L(\theta, \text{accept } H_0) = \begin{cases} \ell_{00} & \text{if } \theta \in \Theta_0 \\ \ell_{01} & \text{otherwise}, \end{cases} \qquad L(\theta, \text{reject } H_0) = \begin{cases} \ell_{10} & \text{if } \theta \in \Theta_0 \\ \ell_{11} & \text{otherwise}. \end{cases} \]

Note: $\ell_{01}$ is the loss incurred by a Type II error (accepting $H_0$ although it is false), and $\ell_{10}$ is the loss incurred by a Type I error (rejecting $H_0$ although it is true).

10.2 Decision rules and Bayes estimators

In general we would like to find a decision rule $\delta : X \to D$, a function which makes a decision based on the data. We would like to choose $\delta$ such that we incur only a small loss. In general there is no $\delta$ that uniformly minimizes $L(\theta, \delta(x))$.

In a Bayesian setting, for a prior $\pi$ and data $x \in X$, the posterior expected loss of a decision $d$ is a function of the data $x$, defined as
\[ \rho(\pi, d \mid x) = \int_\Theta L(\theta, d)\, \pi(\theta \mid x)\, d\theta. \]
For a prior $\pi$, the integrated risk of a decision rule $\delta$ is the real number
\[ r(\pi, \delta) = \int_\Theta \int_X L(\theta, \delta(x))\, f(x \mid \theta)\, dx\, \pi(\theta)\, d\theta. \]
We prefer $\delta_1$ to $\delta_2$ if and only if $r(\pi, \delta_1) < r(\pi, \delta_2)$.

Proposition. An estimator minimizing $r(\pi, \delta)$ can be obtained by selecting, for every $x \in X$, the value $\delta(x)$ that minimizes $\rho(\pi, d \mid x)$.
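Before proving this, a minimal numerical illustration of the pointwise minimization; the binomial model, flat prior, and grids below are illustrative choices, not from the notes.

```python
import numpy as np

# Minimal numerical sketch (illustrative, not from the notes): choose the
# Bayes decision for one observed x by minimizing the posterior expected
# loss rho(pi, d | x) over a grid of candidate decisions d.
theta = np.linspace(1e-3, 1 - 1e-3, 999)    # grid over Theta = (0, 1)
dtheta = theta[1] - theta[0]
x, n = 3, 10                                # observed successes, trials
post = theta**x * (1 - theta)**(n - x)      # flat prior: posterior prop. to likelihood
post /= post.sum() * dtheta                 # normalize numerically

def rho(d):
    """Posterior expected squared error loss of deciding d."""
    return np.sum((theta - d)**2 * post) * dtheta

decisions = np.linspace(0, 1, 1001)
bayes_decision = decisions[np.argmin([rho(d) for d in decisions])]
print(bayes_decision)  # ~0.333: the posterior mean (x+1)/(n+2) of Beta(4, 8)
```

The minimizer agrees with the posterior mean, anticipating the proposition on squared error loss below.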

Proof (additional material).
\[ r(\pi, \delta) = \int_\Theta \int_X L(\theta, \delta(x))\, f(x \mid \theta)\, dx\, \pi(\theta)\, d\theta = \int_X \int_\Theta L(\theta, \delta(x))\, \pi(\theta \mid x)\, p(x)\, d\theta\, dx = \int_X \rho(\pi, \delta(x) \mid x)\, p(x)\, dx. \]
(Recall that $p(x) = \int_\Theta f(x \mid \theta)\, \pi(\theta)\, d\theta$.) This proves the assertion.

A Bayes estimator associated with prior $\pi$ and loss $L$ is any estimator $\delta^\pi$ which minimizes $r(\pi, \delta)$: for every $x \in X$,
\[ \delta^\pi(x) = \arg\min_{d} \rho(\pi, d \mid x). \]
Then $r(\pi) = r(\pi, \delta^\pi)$ is called the Bayes risk. This is valid for proper priors, and for improper priors if $r(\pi) < \infty$. If $r(\pi) = \infty$, one can define a generalized Bayes estimator as the minimizer, for every $x$, of $\rho(\pi, d \mid x)$.

Fact: for strictly convex loss functions, Bayes estimators are unique.

10.3 Some common loss functions

In principle the loss function is part of the problem specification. A very popular choice is squared error loss, $L(\theta, d) = (\theta - d)^2$. This loss function is convex and penalizes large deviations heavily.

Proposition. The Bayes estimator $\delta^\pi$ associated with prior $\pi$ under squared error loss is the posterior mean,
\[ \delta^\pi(x) = E^\pi(\theta \mid x) = \frac{\int_\Theta \theta f(x \mid \theta)\, \pi(\theta)\, d\theta}{\int_\Theta f(x \mid \theta)\, \pi(\theta)\, d\theta}. \]
To see this, recall that for any random variable $Y$, $E((Y - a)^2)$ is minimized by $a = E(Y)$.

Another common choice is absolute error loss, $L(\theta, d) = |\theta - d|$.

Proposition. The posterior median is a Bayes estimator under absolute error loss.
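In conjugate settings both estimators are available from standard library routines; a small sketch, assuming the binomial model with a flat prior, so that the posterior is $\mathrm{Beta}(x+1, n-x+1)$:

```python
from scipy.stats import beta

# Sketch: Bayes estimators under the two loss functions above, for the
# binomial likelihood with a flat prior, i.e. posterior Beta(x+1, n-x+1).
x, n = 3, 10
posterior = beta(x + 1, n - x + 1)

print(posterior.mean())    # posterior mean: Bayes estimator under squared error
print(posterior.median())  # posterior median: Bayes estimator under absolute error
```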

10.4 Bayesian testing

Let $f(x \mid \theta)$ be our sampling distribution, $x \in X$, $\theta \in \Theta$, and suppose that we want to test $H_0 : \theta \in \Theta_0$. The set of possible decisions is $D = \{\text{accept } H_0, \text{reject } H_0\} = \{1, 0\}$, where 1 stands for acceptance. We use the loss function
\[ L(\theta, \varphi) = \begin{cases} 0 & \text{if } \theta \in \Theta_0,\ \varphi = 1 \\ a_0 & \text{if } \theta \in \Theta_0,\ \varphi = 0 \\ 0 & \text{if } \theta \notin \Theta_0,\ \varphi = 0 \\ a_1 & \text{if } \theta \notin \Theta_0,\ \varphi = 1. \end{cases} \]

Proposition. Under this loss function, the Bayes decision rule associated with a prior distribution $\pi$ is
\[ \varphi^\pi(x) = \begin{cases} 1 & \text{if } P^\pi(\theta \in \Theta_0 \mid x) > \frac{a_1}{a_0 + a_1} \\ 0 & \text{otherwise.} \end{cases} \]

Note: in the special case $a_0 = a_1$, the rule states that we accept $H_0$ if $P^\pi(\theta \in \Theta_0 \mid x) > \frac{1}{2}$.

Proof (additional material). The posterior expected loss is
\begin{align*}
\rho(\pi, \varphi \mid x) &= a_0\, P^\pi(\theta \in \Theta_0 \mid x)\, \mathbf{1}(\varphi(x) = 0) + a_1\, P^\pi(\theta \notin \Theta_0 \mid x)\, \mathbf{1}(\varphi(x) = 1) \\
&= a_0\, P^\pi(\theta \in \Theta_0 \mid x) + \mathbf{1}(\varphi(x) = 1)\bigl(a_1 - (a_0 + a_1)\, P^\pi(\theta \in \Theta_0 \mid x)\bigr),
\end{align*}
and $a_1 - (a_0 + a_1)\, P^\pi(\theta \in \Theta_0 \mid x) < 0$ if and only if $P^\pi(\theta \in \Theta_0 \mid x) > \frac{a_1}{a_0 + a_1}$.

Example: $X \sim \mathrm{Bin}(n, \theta)$, $\Theta_0 = [0, 1/2)$, $\pi(\theta) = 1$; then
\[ P^\pi\bigl(\theta < \tfrac{1}{2} \bigm| x\bigr) = \frac{\int_0^{1/2} \theta^x (1 - \theta)^{n - x}\, d\theta}{\int_0^1 \theta^x (1 - \theta)^{n - x}\, d\theta} = \frac{1}{B(x + 1,\, n - x + 1)} \int_0^{1/2} \theta^x (1 - \theta)^{n - x}\, d\theta, \]
the regularized incomplete beta function evaluated at $1/2$, which can be evaluated for particular $n$ and $x$ and compared with the acceptance level $\frac{a_1}{a_0 + a_1}$.
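In other words the rule reduces to comparing a Beta c.d.f. with the acceptance level; a minimal sketch (the inputs $a_0$, $a_1$ are illustrative):

```python
from scipy.stats import beta

# Sketch of the binomial example: with a flat prior the posterior is
# Beta(x+1, n-x+1), so P(theta < 1/2 | x) is a Beta cdf evaluated at 1/2.
def accept_H0(x, n, a0=1.0, a1=1.0):
    post_prob = beta.cdf(0.5, x + 1, n - x + 1)  # P(theta in Theta_0 | x)
    return post_prob > a1 / (a0 + a1), post_prob

print(accept_H0(x=3, n=10))  # (True, ~0.887): accept H0 when a0 = a1
```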

Example: $X \sim N(\theta, \sigma^2)$ with $\sigma^2$ known, and $\theta \sim N(\mu, \tau^2)$. We already calculated that
\[ \pi(\theta \mid x) = N(\mu(x), w^2), \qquad \mu(x) = \frac{\sigma^2 \mu + \tau^2 x}{\sigma^2 + \tau^2}, \qquad w^2 = \frac{\sigma^2 \tau^2}{\sigma^2 + \tau^2}. \]
To test $H_0 : \theta < 0$, note that
\[ P^\pi(\theta < 0 \mid x) = P^\pi\Bigl(\frac{\theta - \mu(x)}{w} < -\frac{\mu(x)}{w}\Bigr) = \Phi\Bigl(-\frac{\mu(x)}{w}\Bigr). \]
Let $z_{a_0, a_1}$ be the $\frac{a_1}{a_0 + a_1}$ quantile of the standard normal distribution; then we accept $H_0$ if $-\mu(x) > z_{a_0, a_1}\, w$, or, equivalently, if
\[ x < -\frac{\sigma^2}{\tau^2}\, \mu - \Bigl(1 + \frac{\sigma^2}{\tau^2}\Bigr) z_{a_0, a_1}\, w. \]
For $\sigma^2 = 1$, $\mu = 0$, $\tau^2 \to \infty$, we accept $H_0$ if $x < -z_{a_0, a_1}$.

Compare this to the frequentist test (again with $\sigma^2 = 1$): accept $H_0$ if $x < z_{1 - \alpha} = -z_\alpha$. This corresponds to $\frac{a_0}{a_1} = \frac{1 - \alpha}{\alpha}$; so $\frac{a_0}{a_1} = 19$ for $\alpha = 0.05$, and $\frac{a_0}{a_1} = 99$ for $\alpha = 0.01$, for example.

Note: if the prior probability of $H_0$ is 0, then so is its posterior probability. Testing $H_0 : \theta = \theta_0$ against $H_1 : \theta > \theta_0$ often really means testing $H_0 : \theta \leq \theta_0$ against $H_1 : \theta > \theta_0$, which is natural to test in a Bayesian setting.
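A small sketch of this one-sided normal test (the parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm

# Sketch of the normal example: accept H0 : theta < 0 when the posterior
# probability Phi(-mu(x)/w) exceeds a1/(a0 + a1).
def accept_H0(x, sigma2=1.0, mu=0.0, tau2=1.0, a0=1.0, a1=1.0):
    mu_x = (sigma2 * mu + tau2 * x) / (sigma2 + tau2)  # posterior mean
    w = np.sqrt(sigma2 * tau2 / (sigma2 + tau2))       # posterior s.d.
    return norm.cdf(-mu_x / w) > a1 / (a0 + a1)

print(accept_H0(x=-0.5))  # True: a negative x favours H0 when a0 = a1
```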

Definition: The Bayes factor for testing $H_0 : \theta \in \Theta_0$ against $H_1 : \theta \in \Theta_1$ is
\[ B^\pi(x) = \frac{P^\pi(\theta \in \Theta_0 \mid x)\,/\,P^\pi(\theta \in \Theta_1 \mid x)}{P^\pi(\theta \in \Theta_0)\,/\,P^\pi(\theta \in \Theta_1)}. \]
It measures the extent to which the data $x$ change the odds of $\Theta_0$ relative to $\Theta_1$.

If $B^\pi(x) > 1$ the data add support to $H_0$; if $B^\pi(x) < 1$ the data add support to $H_1$; and if $B^\pi(x) = 1$ the data do not help to distinguish between $H_0$ and $H_1$. Note that the Bayes factor still depends on the prior $\pi$.

Special case: if $H_0 : \theta = \theta_0$ and $H_1 : \theta = \theta_1$, then
\[ B^\pi(x) = \frac{f(x \mid \theta_0)}{f(x \mid \theta_1)} \]
is the likelihood ratio.

More generally,
\[ B^\pi(x) = \frac{\int_{\Theta_0} \pi(\theta) f(x \mid \theta)\, d\theta \,\Big/ \int_{\Theta_0} \pi(\theta)\, d\theta}{\int_{\Theta_1} \pi(\theta) f(x \mid \theta)\, d\theta \,\Big/ \int_{\Theta_1} \pi(\theta)\, d\theta} = \frac{p(x \mid \theta \in \Theta_0)}{p(x \mid \theta \in \Theta_1)} \]
is the ratio of how likely the data are under $H_0$ to how likely they are under $H_1$. Compare this with the frequentist likelihood ratio
\[ \Lambda(x) = \frac{\max_{\theta \in \Theta_0} f(x \mid \theta)}{\max_{\theta \in \Theta_1} f(x \mid \theta)}: \]
in Bayesian statistics we average instead of taking maxima.

Note: with $\varphi^\pi$ from the Proposition, and $\rho_0 = P^\pi(\theta \in \Theta_0)$, $\rho_1 = P^\pi(\theta \in \Theta_1)$,
\[ B^\pi(x) = \frac{P^\pi(\theta \in \Theta_0 \mid x)\,/\,\bigl(1 - P^\pi(\theta \in \Theta_0 \mid x)\bigr)}{\rho_0/\rho_1}, \]
and so
\[ \varphi^\pi(x) = 1 \iff B^\pi(x) > \frac{a_1}{a_0} \Big/ \frac{\rho_0}{\rho_1}. \]
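For the one-sided binomial example this is just the posterior odds divided by the prior odds; a minimal sketch:

```python
from scipy.stats import beta

# Sketch: Bayes factor for H0 : theta < 1/2 vs H1 : theta >= 1/2 in the
# Bin(n, theta) model with a flat prior (posterior odds / prior odds).
def bayes_factor(x, n):
    p0 = beta.cdf(0.5, x + 1, n - x + 1)  # P(theta in Theta_0 | x)
    prior_odds = 0.5 / 0.5                # the flat prior splits evenly
    return (p0 / (1 - p0)) / prior_odds

print(bayes_factor(x=3, n=10))  # ~7.8 > 1: the data add support to H0
```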

Also, by inverting the equality, it follows that
\[ P^\pi(\theta \in \Theta_0 \mid x) = \Bigl(1 + \frac{\rho_1}{\rho_0}\, \bigl(B^\pi(x)\bigr)^{-1}\Bigr)^{-1}. \]

Example: Let $X \sim \mathrm{Bin}(n, p)$, $H_0 : p = 1/2$, $H_1 : p \neq 1/2$. Choose as prior an atom of size $\rho_0$ at $1/2$, and otherwise uniform; then
\[ B^\pi(x) = \frac{p(x \mid p = 1/2)}{p(x \mid p \in \Theta_1)} = \frac{\binom{n}{x} 2^{-n}}{\binom{n}{x} B(x + 1,\, n - x + 1)} = \frac{2^{-n}}{B(x + 1,\, n - x + 1)}, \]
so that
\[ P\bigl(p = \tfrac{1}{2} \bigm| x\bigr) = \Bigl(1 + \frac{1 - \rho_0}{\rho_0}\, \frac{x!\,(n - x)!}{(n + 1)!}\, 2^n\Bigr)^{-1}. \]
If $\rho_0 = 1/2$, $n = 5$, $x = 3$, then
\[ B^\pi(x) = \frac{2^{-5}}{B(4, 3)} = \frac{15}{8}, \qquad P\bigl(p = \tfrac{1}{2} \bigm| x\bigr) = \Bigl(1 + \tfrac{8}{15}\Bigr)^{-1} = \frac{15}{23}; \]
the data add support to $H_0$, and the posterior probability of $H_0$ is $15/23 > 1/2$.

We could alternatively have chosen the prior with an atom of size $\rho_0$ at $1/2$ and otherwise $\mathrm{Beta}(1/2, 1/2)$; this prior favours values of $p$ near 0 and 1. The posterior probabilities $P(p = \frac{1}{2} \mid x)$ can then be tabulated for $n = 10$ and each $x = 0, 1, \ldots, 10$ in the same way, as in the sketch below.
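A sketch covering both priors; the $\mathrm{Beta}(a, b)$ family on the alternative generalizes the two choices above ($a = b = 1$ is uniform, $a = b = 1/2$ is the alternative prior):

```python
import numpy as np
from scipy.special import betaln

# Sketch: posterior probability of the point null H0 : p = 1/2 with prior
# mass rho0 at 1/2 and a Beta(a, b) prior under the alternative.
def post_prob_null(x, n, rho0=0.5, a=1.0, b=1.0):
    log_m0 = n * np.log(0.5)                          # p(x | p = 1/2) / C(n, x)
    log_m1 = betaln(x + a, n - x + b) - betaln(a, b)  # p(x | H1)      / C(n, x)
    bayes_factor = np.exp(log_m0 - log_m1)
    return 1 / (1 + (1 - rho0) / rho0 / bayes_factor)

print(post_prob_null(3, 5))              # 15/23 ~ 0.652, as computed above
for x in range(11):                      # the n = 10 table, Beta(1/2, 1/2)
    print(x, round(post_prob_null(x, 10, a=0.5, b=0.5), 3))
```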

Example: Let $X \sim N(\theta, \sigma^2)$ with $\sigma^2$ known, and $H_0 : \theta = 0$. We choose as prior mass $\rho_0$ at $\theta = 0$, and otherwise $\theta \sim N(0, \tau^2)$. Then
\[ \bigl(B^\pi(x)\bigr)^{-1} = \frac{p(x \mid \theta \neq 0)}{p(x \mid \theta = 0)} = \frac{(\sigma^2 + \tau^2)^{-1/2} \exp\bigl\{-x^2/\bigl(2(\sigma^2 + \tau^2)\bigr)\bigr\}}{\sigma^{-1} \exp\bigl\{-x^2/(2\sigma^2)\bigr\}} \]
and
\[ P(\theta = 0 \mid x) = \Biggl(1 + \frac{1 - \rho_0}{\rho_0}\, \sqrt{\frac{\sigma^2}{\sigma^2 + \tau^2}}\, \exp\biggl(\frac{\tau^2 x^2}{2\sigma^2(\sigma^2 + \tau^2)}\biggr)\Biggr)^{-1}. \]

Example: $\rho_0 = 1/2$, $\tau = \sigma$; put $z = x/\sigma$. Tabulating $P(\theta = 0 \mid z)$ against $z$, and repeating the tabulation for $\tau = 10\sigma$ (a more diffuse prior), one finds that the same $x$ gives stronger support for $H_0$ under the more diffuse prior than under the tighter one.

Note: for $x$ fixed and $\tau^2 \to \infty$ with $\rho_0 > 0$, we have $P(\theta = 0 \mid x) \to 1$. For the noninformative prior $\pi(\theta) \propto 1$, in contrast (taking $\sigma^2 = 1$), we have
\[ p(x \mid \pi) = \int (2\pi)^{-1/2}\, e^{-\frac{(x - \theta)^2}{2}}\, d\theta = \int (2\pi)^{-1/2}\, e^{-\frac{(\theta - x)^2}{2}}\, d\theta = 1, \]
and so
\[ P(\theta = 0 \mid x) = \Bigl(1 + \frac{1 - \rho_0}{\rho_0}\, \sqrt{2\pi}\, \exp(x^2/2)\Bigr)^{-1}, \]
which is not equal to 1.
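A sketch reconstructing those tabulations (the $z$ values are illustrative):

```python
import numpy as np

# Sketch: P(theta = 0 | x) for the normal point null, prior mass rho0 = 1/2
# at 0 and N(0, tau^2) otherwise, as a function of z = x/sigma, for
# tau = sigma and tau = 10*sigma.
def post_prob_null(z, tau_over_sigma, rho0=0.5):
    r2 = tau_over_sigma**2
    inv_bf = np.sqrt(1 / (1 + r2)) * np.exp(r2 * z**2 / (2 * (1 + r2)))
    return 1 / (1 + (1 - rho0) / rho0 * inv_bf)

for z in [0.0, 1.0, 1.96, 2.58]:          # illustrative z values
    print(z, round(post_prob_null(z, 1), 3), round(post_prob_null(z, 10), 3))
# At z = 1.96 this gives ~0.35 for tau = sigma but ~0.60 for tau = 10*sigma:
# the more diffuse prior lends the same x stronger support for H0.
```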

Lindley's paradox: Let $\bar X \sim N(\theta, \sigma^2/n)$, $H_0 : \theta = 0$, with $n$ fixed. If $\bar x\big/(\sigma/\sqrt n)$ is large enough to reject $H_0$ in the classical test, then for large enough $\tau^2$ the Bayes factor will nevertheless be larger than 1, indicating support for $H_0$. In contrast, if $\sigma^2$ and $\tau^2$ are fixed, and $n \to \infty$ in such a way that $\bar x\big/(\sigma/\sqrt n) = k_\alpha$ stays fixed (just significant at level $\alpha$ in the classical test), then $B^\pi(x) \to \infty$: results which are just significant at some fixed level in the classical test will, for large $n$, actually be much more likely under $H_0$ than under $H_1$. A very diffuse prior proclaims great scepticism, which may overwhelm the contrary evidence of the observations; the sketch below illustrates the effect.
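A numerical sketch of the second limit (illustrative values: $\sigma = 1$, $\tau^2 = 1$, $k_\alpha = 1.96$):

```python
import numpy as np

# Sketch of Lindley's paradox: hold the classical test statistic at the
# boundary of significance, k = 1.96, and let n grow; the Bayes factor
# B = p(xbar | theta = 0) / p(xbar | theta ~ N(0, tau^2)) then diverges.
tau2, k = 1.0, 1.96
for n in [10, 100, 10_000, 1_000_000]:
    s2 = 1.0 / n                        # sigma^2 / n with sigma = 1
    xbar = k * np.sqrt(s2)              # "just significant" sample mean
    bf = np.sqrt((s2 + tau2) / s2) * np.exp(
        -xbar**2 / (2 * s2) + xbar**2 / (2 * (s2 + tau2)))
    print(n, round(float(bf), 2))       # grows like sqrt(n): support for H0
```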

10.5 Least favourable Bayesian answers

The choice of prior affects the result. Suppose that we want to test $H_0 : \theta = \theta_0$ against $H_1 : \theta \neq \theta_0$, and the prior probability on $H_0$ is $\rho_0 = 1/2$. Which prior $g$ on $H_1$, after observing $x$, would be least favourable to $H_0$? Let $G$ be a family of priors on $H_1$; put
\[ B(x, G) = \inf_{g \in G} \frac{f(x \mid \theta_0)}{\int_\Theta f(x \mid \theta)\, g(\theta)\, d\theta} \]
and
\[ P(x, G) = \frac{f(x \mid \theta_0)}{f(x \mid \theta_0) + \sup_{g \in G} \int_\Theta f(x \mid \theta)\, g(\theta)\, d\theta} = \Bigl(1 + \frac{1}{B(x, G)}\Bigr)^{-1}. \]
A Bayesian analysis with prior $g \in G$ on $H_1$ will then place posterior probability at least $P(x, G)$ on $H_0$ (for $\rho_0 = 1/2$). If $\hat\theta$ is the m.l.e. of $\theta$, and $G_A$ is the set of all prior distributions, then
\[ B(x, G_A) = \frac{f(x \mid \theta_0)}{f(x \mid \hat\theta(x))} \qquad\text{and}\qquad P(x, G_A) = \biggl(1 + \frac{f(x \mid \hat\theta(x))}{f(x \mid \theta_0)}\biggr)^{-1}. \]

Other natural families are $G_S$, the set of distributions symmetric around $\theta_0$, and $G_{SU}$, the set of unimodal distributions symmetric around $\theta_0$.

Example: Let $X \sim N(\theta, 1)$, $H_0 : \theta = \theta_0$, $H_1 : \theta \neq \theta_0$. Here we can tabulate, for each p-value, the corresponding lower bounds $P(x, G_A)$ and $P(x, G_{SU})$; for instance, a p-value of 0.05 (that is, $|x - \theta_0| = 1.96$) gives $P(x, G_A) = (1 + e^{1.96^2/2})^{-1} \approx 0.128$ (see the sketch below). The Bayesian approach will typically reject $H_0$ less frequently than the frequentist approach.
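A sketch computing both bounds, assuming (the standard reduction for $G_{SU}$) that it suffices to optimize over uniform priors on intervals $[\theta_0 - K, \theta_0 + K]$:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

# Sketch: lower bounds on the posterior probability of the point null
# theta = theta0 = 0 over (i) all priors G_A and (ii) symmetric unimodal
# priors G_SU, the latter reduced to uniforms on [-K, K].
def p_GA(x):
    return 1 / (1 + np.exp(x**2 / 2))   # g concentrated at the m.l.e.

def p_GSU(x):
    def neg_marginal(K):                # -(1/(2K)) * int of phi(x - t) over [-K, K]
        return -(norm.cdf(x + K) - norm.cdf(x - K)) / (2 * K)
    m1 = -minimize_scalar(neg_marginal, bounds=(1e-6, 50), method="bounded").fun
    return norm.pdf(x) / (norm.pdf(x) + m1)

for p in [0.10, 0.05, 0.01]:
    x = norm.ppf(1 - p / 2)             # two-sided p-value -> |x - theta0|
    print(p, round(p_GA(x), 3), round(p_GSU(x), 3))
# Roughly (0.205, 0.39), (0.128, 0.29), (0.035, 0.11): in each case the
# lower bound on P(H0 | x) is far above the corresponding p-value.
```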

Consider the general test problem $H_0 : \theta = \theta_0$ against $H_1 : \theta = \theta_1$, and consider repetitions in which one uses the most powerful test at level $\alpha = 0.01$. In the frequentist analysis, only 1% of the true $H_0$ will be rejected. But this does not say anything about the proportion of errors made when rejecting!

Example: Suppose that the probability of a type II error is 0.99, and that $\theta_0$ and $\theta_1$ occur equally often. Then rejections occur at rate 0.01 both when $H_0$ is true (the level) and when $H_1$ is true (the power, $1 - 0.99 = 0.01$), so about half of the rejections of $H_0$ will be in error.

10.6 Comparison with frequentist hypothesis testing

In frequentist hypothesis testing:
* there is an asymmetry between $H_0$ and $H_1$: one fixes the type I error and minimizes the type II error, and UMP tests do not always exist (e.g. for general two-sided tests);
* p-values have no intrinsic optimality (the space of p-values lacks a decision-theoretic foundation), are routinely misinterpreted, and do not take the type II error into account;
* confidence regions are a pre-data measure, and can often have very different post-data coverage probabilities.

Example: Let $X \sim N(\theta, 1/2)$, $H_0 : \theta = -1$, $H_1 : \theta = 1$, and suppose that $x = 0$. Then the UMP p-value is 0.072, but the p-value for the test of $H_1$ against $H_0$ takes exactly the same value.

Example: Let $X_1, \ldots, X_n$ be i.i.d. $N(\theta, \sigma^2)$, with both $\theta$ and $\sigma^2$ unknown. Then a $100(1 - \alpha)\%$ confidence interval for $\theta$ is
\[ C = \Bigl(\bar x - t_{\alpha/2}\, \frac{s}{\sqrt n},\ \bar x + t_{\alpha/2}\, \frac{s}{\sqrt n}\Bigr). \]
For $n = 2$, $\alpha = 0.5$, the pre-data coverage probability is 0.5. However, Brown (Ann. Math. Statist. 38, 1967) showed that
\[ P\bigl(\theta \in C \bigm| |\bar x|/s < 1 + \sqrt 2\bigr) > 2/3. \]
Bayesian testing, in contrast, compares the probability of the actual data under the two hypotheses.

Chapter 11
Hierarchical and Empirical Bayesian Methods

11.1 Hierarchical Bayes

A hierarchical model consists of modelling a parameter $\theta$ through randomness at different levels; for example,
\[ \theta \mid \beta \sim \pi_1(\theta \mid \beta), \qquad \beta \sim \pi_2(\beta), \]
so that
\[ \pi(\theta) = \int \pi_1(\theta \mid \beta)\, \pi_2(\beta)\, d\beta. \]
When dealing with complicated posterior distributions, rather than evaluating the integrals we might use simulation to approximate them. For simulation in hierarchical models, we first simulate $\beta$, and then, given $\beta$, we simulate $\theta$. We hope that the distribution of $\beta$ is easy to simulate from, and also that the conditional distribution of $\theta$ given $\beta$ is easy to simulate from. This approach is particularly useful for MCMC (Markov chain Monte Carlo) methods; see next term.
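As a toy illustration of this two-level simulation scheme (the Gamma hyperprior and Normal conditional below are illustrative assumptions, not from the notes):

```python
import numpy as np

# Toy sketch of the two-level scheme: draw beta from pi_2, then
# theta | beta from pi_1; the draws of theta then follow the marginal
# prior pi(theta) = int pi_1(theta | beta) pi_2(beta) dbeta.
rng = np.random.default_rng(0)
n_draws = 100_000
beta = rng.gamma(shape=2.0, scale=1.0, size=n_draws)  # beta ~ pi_2
theta = rng.normal(0.0, 1.0 / np.sqrt(beta))          # theta | beta ~ pi_1

# The marginal here is a Student-t: heavier-tailed than any single normal.
print(theta.mean(), theta.std())
```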

11.2 Empirical Bayes

Let $x \sim f(x \mid \theta)$. The empirical Bayes method chooses a convenient prior family $\pi(\theta \mid \lambda)$ (typically conjugate), where $\lambda$ is a hyperparameter, so that
\[ p(x \mid \lambda) = \int f(x \mid \theta)\, \pi(\theta \mid \lambda)\, d\theta. \]
Rather than specifying $\lambda$, we estimate $\lambda$ by $\hat\lambda$, for example by frequentist methods based on $p(x \mid \lambda)$, and we substitute $\hat\lambda$ for $\lambda$; $\pi(\theta \mid x, \hat\lambda)$ is called a pseudo-posterior, and we plug it into Bayes' theorem for inference. The empirical Bayes approach
* is neither fully Bayesian nor fully frequentist;
* depends on $\hat\lambda$: different $\hat\lambda$ will lead to different procedures;
* if $\hat\lambda$ is consistent, asymptotically leads to a coherent Bayesian analysis;
* often outperforms classical estimators in empirical terms.

Example (James-Stein estimators): Let $X_i \sim N(\theta_i, 1)$ be independent given $\theta_i$, $i = 1, \ldots, p$, where $p \geq 3$; in vector notation, $X \sim N(\theta, I_p)$. Here the vector $\theta$ is random; assume that we have realizations $\theta_i$, $i = 1, \ldots, p$. The obvious estimate for $\theta_i$ (which is the least-squares estimate) is $\hat\theta_i = x_i$, leading to $\hat\theta = X$. Abbreviating the observed vector by $x$, our decision rule would hence be $\delta(x) = x$. But this is not the best rule!

Assume that $\theta_i \sim N(0, \tau^2)$; then $p(x \mid \tau^2) = N(0, (1 + \tau^2) I_p)$, and the posterior for $\theta$ given the data is
\[ \theta \mid x \sim N\Bigl(\frac{\tau^2}{1 + \tau^2}\, x,\ \frac{\tau^2}{1 + \tau^2}\, I_p\Bigr). \]
Under quadratic loss, the Bayes estimator $\delta(x)$ of $\theta$ is the posterior mean $\frac{\tau^2}{1 + \tau^2}\, x$. In the empirical Bayes approach, we would use the m.l.e. for $\tau^2$, which satisfies
\[ \frac{\hat\tau^2}{1 + \hat\tau^2} = \Bigl(1 - \frac{p}{\|x\|^2}\Bigr)\, \mathbf{1}\bigl(\|x\|^2 > p\bigr), \]

and the empirical Bayes estimator is the estimated posterior mean,
\[ \delta^{EB}(x) = \frac{\hat\tau^2}{1 + \hat\tau^2}\, x = \Bigl(1 - \frac{p}{\|x\|^2}\Bigr)^{\!+} x; \]
this is the truncated James-Stein estimator, and it can be shown to outperform the estimator $\delta(x) = x$.

Alternatively, the best unbiased estimator of $1/(1 + \tau^2)$ is $\frac{p - 2}{\|x\|^2}$, giving
\[ \delta^{EB}(x) = \Bigl(1 - \frac{p - 2}{\|x\|^2}\Bigr)\, x. \]
This is the James-Stein estimator. It can be shown that under quadratic loss the James-Stein estimator has smaller integrated risk than $\delta(x) = x$.

Note: both estimators tend to shrink towards 0. It is now known to be a very general phenomenon that when comparing three or more populations, the sample mean is not the best estimator; shrinkage estimators are an active area of research.

Bayesian computation of posterior probabilities can be very computer-intensive; see the MCMC and Applied Bayesian Statistics course.
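Returning to the James-Stein example, a quick Monte Carlo sketch of the risk comparison (the true $\theta$ below is an arbitrary illustrative choice):

```python
import numpy as np

# Monte Carlo sketch: compare the quadratic risk of the m.l.e. delta(x) = x
# with the James-Stein estimator (1 - (p-2)/||x||^2) x at one fixed theta.
rng = np.random.default_rng(1)
p, reps = 10, 20_000
theta = rng.normal(size=p)                       # an arbitrary true vector
x = theta + rng.normal(size=(reps, p))           # X ~ N(theta, I_p), repeated

shrink = 1 - (p - 2) / np.sum(x**2, axis=1, keepdims=True)
js = shrink * x                                  # James-Stein estimate
print(np.mean(np.sum((x - theta)**2, axis=1)))   # risk of the m.l.e.: ~p = 10
print(np.mean(np.sum((js - theta)**2, axis=1)))  # strictly smaller for p >= 3
```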

Chapter 12
Principles of Inference

12.1 The Likelihood Principle

The Likelihood Principle: the information brought by an observation $x$ about $\theta$ is entirely contained in the likelihood function $L(\theta \mid x)$. It follows that if $x_1$ and $x_2$ are two observations with likelihoods $L_1(\theta \mid x_1)$ and $L_2(\theta \mid x_2)$, and if
\[ L_1(\theta \mid x_1) = c(x_1, x_2)\, L_2(\theta \mid x_2), \]
then $x_1$ and $x_2$ must lead to identical inferences.

Example: We know that $\pi(\theta \mid x) \propto f(x \mid \theta)\, \pi(\theta)$. If $f_1(x \mid \theta) \propto f_2(x \mid \theta)$ as a function of $\theta$, then they have the same posterior, so they lead to the same Bayesian inference.

Example (binomial versus negative binomial):
(a) Let $X \sim \mathrm{Bin}(n, \theta)$ be the number of successes in $n$ independent trials, with p.m.f.
\[ f_1(x \mid \theta) = \binom{n}{x}\, \theta^x (1 - \theta)^{n - x}; \]
then
\[ \pi(\theta \mid x) \propto \binom{n}{x}\, \theta^x (1 - \theta)^{n - x}\, \pi(\theta) \propto \theta^x (1 - \theta)^{n - x}\, \pi(\theta). \]

(b) Let $N \sim \mathrm{NegBin}(x, \theta)$ be the number of independent trials until $x$ successes, with p.m.f.
\[ f_2(n \mid \theta) = \binom{n - 1}{x - 1}\, \theta^x (1 - \theta)^{n - x}, \]
and again
\[ \pi(\theta \mid n) \propto \theta^x (1 - \theta)^{n - x}\, \pi(\theta). \]
Bayesian inference about $\theta$ does not depend on whether a binomial or a negative binomial sampling scheme was used.

M.l.e.'s satisfy the likelihood principle, but many frequentist procedures do not!

Example ($\mathrm{Bin}(n, \theta)$ sampling): We observe $(x_1, \ldots, x_n) = (0, \ldots, 0, 1)$. An unbiased estimate for $\theta$ is $\hat\theta = 1/n$. If instead we view $n$ as $\mathrm{Geometric}(\theta)$, then the only unbiased estimator for $\theta$ is $\hat\theta = \mathbf{1}(n = 1)$. Unbiasedness typically violates the likelihood principle: it involves integrals over the sample space, and so it depends on the value of $f(x \mid \theta)$ for values of $x$ other than the observed value.

Example: (a) We observe a binomial random variable with $n = 12$: 9 heads and 3 tails. Suppose that we want to test $H_0 : \theta = 1/2$ against $H_1 : \theta > 1/2$. We can calculate that the UMP test has $P(X \geq 9) = 0.073$. (b) If instead we continue tossing until 3 tails are recorded, and observe that $N = 12$ tosses are needed, then the underlying distribution is negative binomial, and $P(N \geq 12) = 0.0327$. The likelihoods are proportional, yet the two p-values differ (see the sketch below).
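A sketch of this computation (note that scipy's nbinom counts failures, here heads, before the required number of successes, here tails):

```python
from scipy.stats import binom, nbinom

# Sketch of the stopping-rule example: same data (9 heads, 3 tails), the
# same likelihood up to a constant, but different frequentist p-values.
n, heads, tails = 12, 9, 3

p_binom = binom.sf(heads - 1, n, 0.5)        # P(X >= 9), X ~ Bin(12, 1/2)
# N >= 12 total tosses means at least 9 heads before the 3rd tail:
p_nbinom = nbinom.sf(heads - 1, tails, 0.5)  # P(N >= 12)
print(round(p_binom, 4), round(p_nbinom, 4)) # 0.073 vs 0.0327
```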

12.2 The conditionality perspective

Example (Cox 1958): A scientist wants to measure a physical quantity $\theta$. Machine 1 gives measurements $X_1 \sim N(\theta, 1)$, but is often busy; machine 2 gives measurements $X_2 \sim N(\theta, 100)$. The availability of machine 1 is beyond the scientist's control, and independent of the object to be measured. Assume that on any given occasion machine 1 is available with probability 1/2; if it is available, the scientist chooses machine 1.

A standard 95% confidence interval, computed unconditionally, is about $(x - 16.4,\ x + 16.4)$, because of the possibility that machine 2 was used.

Conditionality Principle: if two experiments on the parameter $\theta$ are available, and if one of these two experiments is selected with probability 1/2, then the resulting inference on $\theta$ should depend only on the selected experiment.

The conditionality principle is satisfied in Bayesian analysis. In the frequentist approach, we could condition on an ancillary statistic, but such a statistic is not always available.

A related principle is the Stopping Rule Principle (SRP). A stopping rule is a random variable that tells us when to stop the experiment; this random variable depends only on the outcomes of the first $n$ experiments (it does not look into the future). The stopping rule principle states that if a sequence of experiments is directed by a stopping rule, then, given the resulting sample, the inference about $\theta$ should not depend on the nature of the stopping rule. The likelihood principle implies the SRP. The SRP is satisfied in Bayesian inference, but it is not always satisfied in frequentist analysis.
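Returning to Cox's example, a short sketch solving the unconditional coverage equation for the half-width:

```python
from scipy.stats import norm
from scipy.optimize import brentq

# Sketch of Cox's example: the unconditional 95% interval half-width c
# solves (1/2) P(|Z| <= c) + (1/2) P(|Z| <= c/10) = 0.95, mixing the
# sd = 1 and sd = 10 machines with equal probability.
def coverage(c):
    return 0.5 * (2 * norm.cdf(c) - 1) + 0.5 * (2 * norm.cdf(c / 10) - 1)

c = brentq(lambda c: coverage(c) - 0.95, 1.0, 30.0)
print(round(c, 2))  # ~16.45, versus +-1.96 conditionally on machine 1
```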
