9 Bayesian inference

9.1 Subjective probability


This is probability regarded as degree of belief. A subjective probability of an event $A$ is assessed as $p$ if you are prepared to stake $pM$ to win $M$, and equally prepared to accept a stake of $pM$ to win $M$. In other words the bet is fair, and you are assumed to behave rationally.

9.1.1 Kolmogorov's axioms

How does subjective probability fit in with the fundamental axioms? Let $\mathcal{A}$ be the set of all subsets of a countable sample space $\Omega$. Then

(i) $P(A) \geq 0$ for every $A \in \mathcal{A}$;

(ii) $P(\Omega) = 1$;

(iii) if $\{A_\lambda : \lambda \in \Lambda\}$ is a countable set of mutually exclusive events belonging to $\mathcal{A}$, then
$$P\Big(\bigcup_{\lambda \in \Lambda} A_\lambda\Big) = \sum_{\lambda \in \Lambda} P(A_\lambda).$$
Obviously the subjective interpretation has no difficulty in conforming with (i) and (ii). (iii) is slightly less obvious. Suppose we have two events $A$ and $B$ such that $A \cap B = \emptyset$. Consider a stake of $p_A M$ to win $M$ if $A$ occurs, and a stake of $p_B M$ to win $M$ if $B$ occurs. The total stake for bets on $A$ or $B$ occurring is $p_A M + p_B M$ to win $M$ if $A \cup B$ occurs. Thus we have $(p_A + p_B)M$ to win $M$, and so $P(A \cup B) = P(A) + P(B)$.

9.1.2 Conditional probability

Define $p_B$, $p_{AB}$, $p_{A|B}$ such that

$p_B M_2$ is the fair stake for $M_2$ if $B$ occurs;

$p_{AB} M_3$ is the fair stake for $M_3$ if $A$ and $B$ occur;

$p_{A|B} M_1$ is the fair stake for $M_1$ if $A$ occurs given that $B$ has occurred; otherwise the bet is off.

Consider the three alternative outcomes $\bar{B}$, $\bar{A} \cap B$ and $A \cap B$. If $G_1$, $G_2$, $G_3$ are the corresponding gains, then
$$\begin{aligned}
\bar{B}: &\quad -p_B M_2 - p_{AB} M_3 = G_1, \\
\bar{A} \cap B: &\quad -p_{A|B} M_1 + (1 - p_B) M_2 - p_{AB} M_3 = G_2, \\
A \cap B: &\quad (1 - p_{A|B}) M_1 + (1 - p_B) M_2 + (1 - p_{AB}) M_3 = G_3.
\end{aligned}$$
The principle of rationality does not allow the possibility of setting the stakes to obtain a sure profit: if the coefficient matrix were non-singular, we could solve for $(M_1, M_2, M_3)$ to achieve any gains we pleased. Thus
$$\begin{vmatrix}
0 & -p_B & -p_{AB} \\
-p_{A|B} & 1 - p_B & -p_{AB} \\
1 - p_{A|B} & 1 - p_B & 1 - p_{AB}
\end{vmatrix} = 0,$$
which gives $p_{AB} - p_B\, p_{A|B} = 0$, so that
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) \neq 0.$$
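This coherence argument is easy to check numerically. The sketch below (illustrative prices, not from the notes) builds the gain matrix from the three equations above: when $p_{AB} \neq p_B\, p_{A|B}$ the matrix is non-singular, so stakes guaranteeing a sure profit can be solved for; when the prices cohere, the determinant vanishes.

```python
import numpy as np

def gain_matrix(p_B, p_AB, p_AgB):
    """Rows: outcomes (B fails; B occurs but A fails; A and B occur).
    Columns: stakes M1 (on A given B), M2 (on B), M3 (on A and B)."""
    return np.array([
        [0.0,        -p_B,    -p_AB],
        [-p_AgB,     1 - p_B, -p_AB],
        [1 - p_AgB,  1 - p_B, 1 - p_AB],
    ])

# Incoherent prices: p_AB != p_B * p_AgB, so the matrix is invertible
G = gain_matrix(p_B=0.5, p_AB=0.4, p_AgB=0.6)
M = np.linalg.solve(G, np.ones(3))   # stakes giving a sure gain of 1
print(M, G @ M)                      # G @ M = [1, 1, 1]: a Dutch book

# Coherent prices: p_AB = p_B * p_AgB, and the determinant vanishes
G0 = gain_matrix(p_B=0.5, p_AB=0.3, p_AgB=0.6)
print(np.linalg.det(G0))             # ~0: no sure-profit stakes exist
```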

9.2 Estimation formulated as a decision problem

$\mathcal{A}$: the action space, which is the set of all possible actions $a \in \mathcal{A}$ available.

$\Theta$: the parameter space, consisting of all possible states of nature, only one of which occurs or will occur (this true state being unknown).

$L$: the loss function, having domain $\Theta \times \mathcal{A}$ (the set of all possible consequences $(\theta, a)$) and codomain $\mathbb{R}$.

$R_X$: containing all possible realisations $x \in R_X$ of a random variable $X$ belonging to the family $\{f(x; \theta) : \theta \in \Theta\}$.

$D$: the decision space, consisting of all possible decision functions $\delta \in D$, each decision function having domain $R_X$ and codomain $\mathcal{A}$.

9.2.1 The basic idea

The true state of the world is unknown when the action is chosen. The loss which results from each of the possible consequences $(\theta, a)$ is known. Data are collected to provide information about the unknown $\theta$, and a choice of action is made for each possible observation of $X$. Decision theory is concerned with the criteria for making this choice. The form of the loss function is crucial: it depends upon the nature of the problem.

Example: You are a fashion buyer and must stock up with trendy dresses. The true number you will sell is $\theta$ (unknown).

You stock up with $a$ dresses. If you overstock, you must sell off the surplus at your end-of-season sale and lose $A$ per dress. If you understock, you lose the profit of $B$ per dress. Thus
$$L(\theta, a) = \begin{cases} A(a - \theta), & a \geq \theta, \\ B(\theta - a), & a \leq \theta. \end{cases}$$

Special forms of loss function

If $A = B$, you have modular loss,
$$L(\theta, a) = A\,|\theta - a|.$$
To suit comparison with classical inference, quadratic loss is often used:
$$L(\theta, a) = C(\theta - a)^2.$$

9.2.2 The risk function

This is a measure of the quality of a decision function. Suppose we choose a decision $\delta(x)$ (i.e. choose $a = \delta(x)$). The function $R$ with domain $\Theta \times D$ and codomain $\mathbb{R}$ defined by
$$R(\theta, \delta) = \int_{R_X} L(\theta, \delta(x))\, f(x; \theta)\, dx$$
is called the risk function. In the discrete case we may write this as
$$R(\theta, \delta) = \sum_{x \in R_X} L(\theta, \delta(x))\, p(x; \theta).$$
In either case,
$$R(\theta, \delta) = E[L(\theta, \delta(X))].$$

Example: Suppose $X_1, \ldots, X_n$ is a random sample from $N(\theta, \theta)$ (mean and variance both $\theta$), and suppose we assume quadratic loss.

Consider
$$\delta_1(X) = \bar{X}, \qquad \delta_2(X) = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2.$$
We know that $E(\bar{X}) = \theta$, so
$$R(\theta, \delta_1) = E\big[(\theta - \bar{X})^2\big] = V(\bar{X}) = \frac{\theta}{n}.$$
But $E[\delta_2(X)] = \theta$, so that
$$R(\theta, \delta_2) = E\big[(\theta - \delta_2(X))^2\big] = V[\delta_2(X)] = \frac{\theta^2}{(n-1)^2}\, V\left[ \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\theta} \right] = \frac{\theta^2}{(n-1)^2} \cdot 2(n-1) = \frac{2\theta^2}{n-1}.$$

[Figure: the two risk functions plotted against $\theta$; the line $R(\theta, \delta_1) = \theta/n$ and the parabola $R(\theta, \delta_2) = 2\theta^2/(n-1)$ cross at $\theta = (n-1)/2n$.]

Without knowing $\theta$, how do we choose between $\delta_1$ and $\delta_2$?
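A quick Monte Carlo check of these two risk functions (a sketch; the sample size $n = 10$ is an arbitrary choice, so the curves cross at $\theta = 9/20 = 0.45$):

```python
import numpy as np

rng = np.random.default_rng(0)

def risks(theta, n=10, reps=200_000):
    """Monte Carlo estimates of R(theta, delta_1) and R(theta, delta_2)
    under quadratic loss, for samples from N(theta, theta)."""
    x = rng.normal(theta, np.sqrt(theta), size=(reps, n))
    d1 = x.mean(axis=1)            # delta_1: the sample mean
    d2 = x.var(axis=1, ddof=1)     # delta_2: the sample variance
    return np.mean((theta - d1) ** 2), np.mean((theta - d2) ** 2)

for theta in [0.2, 0.45, 1.0]:
    r1, r2 = risks(theta)
    # simulated risks against the theoretical theta/n and 2*theta^2/(n-1)
    print(theta, r1, theta / 10, r2, 2 * theta ** 2 / 9)
```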

Two sensible criteria:

1. Take precautions against the worst.

2. Take into account any further information you may have.

Criterion 1 leads to minimax. Choose the decision leading to the smallest maximum risk, i.e. choose $\delta^* \in D$ such that
$$\sup_{\theta \in \Theta} \{R(\theta, \delta^*)\} = \inf_{\delta \in D} \sup_{\theta \in \Theta} \{R(\theta, \delta)\}.$$
Minimax rules have disadvantages, the main one being that they treat Nature as an adversary. Minimax guards against the worst case, but the value of $\theta$ should not be regarded as a value deliberately chosen by Nature to make the risk large. Fortunately, in many situations minimax rules turn out to be reasonable in practice, if not in concept.

Criterion 2 involves introducing beliefs about $\theta$ into the analysis. If $\pi(\theta)$ represents our beliefs about $\theta$, then define the Bayes risk $r(\pi, \delta)$ as
$$r(\pi, \delta) = \begin{cases} \int_\Theta R(\theta, \delta)\, \pi(\theta)\, d\theta & \text{if } \theta \text{ is continuous}, \\[4pt] \sum_{\theta \in \Theta} R(\theta, \delta)\, \pi(\theta) & \text{if } \theta \text{ is discrete}. \end{cases}$$
Now choose $\delta^*$ such that
$$r(\pi, \delta^*) = \inf_{\delta \in D} \{r(\pi, \delta)\}.$$

Bayes rule: The Bayes rule with respect to a prior $\pi$ is the decision rule $\delta$ that minimises the Bayes risk among all possible decision rules.

[Photograph: Bayes's tomb.]
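The two criteria can pick different rules. A minimal sketch with a made-up finite risk matrix (three decision rules, two states; the numbers are purely illustrative):

```python
import numpy as np

# R[i, j] = R(theta_i, delta_j): rows are states, columns are decision rules
R = np.array([[1.0, 3.0, 2.0],
              [4.0, 1.0, 2.5]])

print(np.argmin(R.max(axis=0)))   # minimax: smallest worst-case risk (delta_3)

pi = np.array([0.7, 0.3])         # prior beliefs about theta
print(np.argmin(pi @ R))          # Bayes: smallest prior-averaged risk (delta_1)
```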

9.2.3 Decision terminology

$\delta^*$ is called a Bayes decision function; $\delta^*(X)$ is a Bayes estimator; $\delta^*(x)$ is a Bayes estimate.

If a decision function $\delta^\circ$ is such that there exists a $\delta'$ satisfying
$$R(\theta, \delta') \leq R(\theta, \delta^\circ) \quad \text{for all } \theta \in \Theta$$
and
$$R(\theta, \delta') < R(\theta, \delta^\circ) \quad \text{for some } \theta \in \Theta,$$
then $\delta^\circ$ is dominated by $\delta'$, and $\delta^\circ$ is said to be inadmissible. All other decisions are admissible.

9.3 Prior and posterior distributions

The prior distribution represents our initial belief about the parameter $\theta$. This means that $\theta$ is regarded as a random variable with p.d.f. $\pi(\theta)$. From Bayes' Theorem,
$$f(x \mid \theta)\, \pi(\theta) = \pi(\theta \mid x)\, f(x)$$
or
$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\, \pi(\theta)}{\int_\Theta f(x \mid \theta)\, \pi(\theta)\, d\theta}.$$
$\pi(\theta \mid x)$ is called the posterior p.d.f. It represents our updated belief about $\theta$, having taken account of the data. We write the expression in the form
$$\pi(\theta \mid x) \propto f(x \mid \theta)\, \pi(\theta)$$
or
$$\text{Posterior} \propto \text{Likelihood} \times \text{Prior}.$$

Example: Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a Poisson distribution, $\mathrm{Poi}(\theta)$, where $\theta$ has prior distribution
$$\pi(\theta) = \frac{\theta^{\alpha - 1} e^{-\theta}}{\Gamma(\alpha)}, \qquad \theta \geq 0,\ \alpha > 0.$$
The posterior distribution is given by
$$\pi(\theta \mid x) \propto \frac{\theta^{\sum x_i}\, e^{-n\theta}}{\prod x_i!} \cdot \frac{\theta^{\alpha - 1} e^{-\theta}}{\Gamma(\alpha)},$$
which simplifies to
$$\pi(\theta \mid x) \propto \theta^{\sum x_i + \alpha - 1}\, e^{-(n+1)\theta}, \qquad \theta \geq 0.$$
This is recognisable as a $\Gamma$-distribution and, by comparison with $\pi(\theta)$, we can write
$$\pi(\theta \mid x) = \frac{(n+1)^{\sum x_i + \alpha}\, \theta^{\sum x_i + \alpha - 1}\, e^{-(n+1)\theta}}{\Gamma\!\left(\sum x_i + \alpha\right)}, \qquad \theta \geq 0,$$
which represents our posterior belief about $\theta$.
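This conjugate update is easy to verify numerically. A sketch using scipy (simulated data; the prior shape $\alpha = 2$ is arbitrary): the grid-normalised kernel $\theta^{\sum x_i + \alpha - 1} e^{-(n+1)\theta}$ matches the $\Gamma(\sum x_i + \alpha,\ n+1)$ density, and the posterior mean is $(\sum x_i + \alpha)/(n+1)$, which reappears as the Bayes estimate in the next section.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n = 2.0, 25
x = rng.poisson(3.0, size=n)                    # simulated Poisson data

# Conjugate update derived above: Gamma(sum(x) + alpha) with rate n + 1
post = stats.gamma(a=x.sum() + alpha, scale=1.0 / (n + 1))

# Grid evaluation of the posterior kernel, normalised numerically
theta = np.linspace(1e-6, 10, 4000)
kernel = theta ** (x.sum() + alpha - 1) * np.exp(-(n + 1) * theta)
kernel /= kernel.sum() * (theta[1] - theta[0])

print(np.max(np.abs(kernel - post.pdf(theta))))   # ~0: same distribution
print(post.mean(), (x.sum() + alpha) / (n + 1))   # posterior mean, two ways
```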

9.4 Decisions

We have already defined the Bayes risk to be
$$r(\pi, \delta) = \int_\Theta R(\theta, \delta)\, \pi(\theta)\, d\theta,$$
where
$$R(\theta, \delta) = \int_{R_X} L(\theta, \delta(x))\, f(x \mid \theta)\, dx,$$
so that
$$r(\pi, \delta) = \int_\Theta \int_{R_X} L(\theta, \delta(x))\, f(x \mid \theta)\, dx\, \pi(\theta)\, d\theta = \int_{R_X} \left[ \int_\Theta L(\theta, \delta(x))\, \pi(\theta \mid x)\, d\theta \right] f(x)\, dx.$$
For any given $x$, minimising the Bayes risk is therefore equivalent to minimising
$$\int_\Theta L(\theta, \delta(x))\, \pi(\theta \mid x)\, d\theta,$$
i.e. we minimise the posterior expected loss.

Example: Take the posterior distribution in the previous example and, for no special reason, assume quadratic loss
$$L(\theta, \delta(x)) = [\theta - \delta(x)]^2.$$
Then we minimise
$$\int_\Theta [\theta - \delta(x)]^2\, \pi(\theta \mid x)\, d\theta$$
by differentiating with respect to $\delta$, to obtain
$$-2 \int_\Theta [\theta - \delta(x)]\, \pi(\theta \mid x)\, d\theta = 0 \quad \Rightarrow \quad \delta(x) = \int_\Theta \theta\, \pi(\theta \mid x)\, d\theta.$$

This is the mean of the posterior distribution:
$$\delta(x) = \int_0^\infty \theta\, \frac{(n+1)^{\sum x_i + \alpha}\, \theta^{\sum x_i + \alpha - 1}\, e^{-(n+1)\theta}}{\Gamma\!\left(\sum x_i + \alpha\right)}\, d\theta = \frac{(n+1)^{\sum x_i + \alpha}\, \Gamma\!\left(\sum x_i + \alpha + 1\right)}{\Gamma\!\left(\sum x_i + \alpha\right) (n+1)^{\sum x_i + \alpha + 1}} = \frac{\sum x_i + \alpha}{n+1}.$$

Theorem: Suppose that $\delta_\pi$ is a decision rule that is Bayes with respect to some prior distribution $\pi$. If the risk function of $\delta_\pi$ satisfies
$$R(\theta, \delta_\pi) \leq r(\pi, \delta_\pi) \quad \text{for all } \theta \in \Theta,$$
then $\delta_\pi$ is a minimax decision rule.

Proof: Suppose $\delta_\pi$ is not minimax. Then there is a decision rule $\delta'$ such that
$$\sup_{\theta \in \Theta} R(\theta, \delta') < \sup_{\theta \in \Theta} R(\theta, \delta_\pi).$$
For this decision rule we have, since, if $f(x) \leq k$, then $E(f(X)) \leq k$,
$$r(\pi, \delta') \leq \sup_{\theta \in \Theta} R(\theta, \delta') < \sup_{\theta \in \Theta} R(\theta, \delta_\pi) \leq r(\pi, \delta_\pi),$$
contradicting the statement that $\delta_\pi$ is Bayes with respect to $\pi$. Hence $\delta_\pi$ is minimax.

Example: Let $X_1, \ldots, X_n$ be a random sample from a Bernoulli distribution $B(1, p)$. Let $Y = \sum X_i$ and consider estimating $p$ using the loss function
$$L(p, a) = \frac{(p - a)^2}{p(1 - p)}.$$

This loss penalises mistakes more when $p$ is near 0 or 1. The m.l.e. is $\hat{p} = Y/n$ and
$$R(p, \hat{p}) = E\left[ \frac{(p - \hat{p})^2}{p(1 - p)} \right] = \frac{1}{p(1 - p)} \cdot \frac{p(1 - p)}{n} = \frac{1}{n}.$$
The risk is constant, so by the theorem above $\hat{p}$ will be minimax once we show it is Bayes. [NB A decision rule with constant risk is called an equalizer rule.]

We now show that $\hat{p}$ is the Bayes estimator with respect to the $U(0, 1)$ prior, which will imply that $\hat{p}$ is minimax. Here
$$\pi(p \mid y) \propto p^y (1 - p)^{n - y},$$
so we want the value of $a$ which minimises
$$\int_0^1 \frac{(p - a)^2}{p(1 - p)}\, p^y (1 - p)^{n - y}\, dp.$$
Differentiation with respect to $a$ yields
$$a = \frac{\int_0^1 p^y (1 - p)^{n - y - 1}\, dp}{\int_0^1 p^{y - 1} (1 - p)^{n - y - 1}\, dp}.$$
Now a $\beta$-distribution with parameters $\alpha, \beta$ has p.d.f.
$$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \qquad x \in [0, 1],$$
so
$$a = \frac{\Gamma(y + 1)\Gamma(n - y)}{\Gamma(n + 1)} \cdot \frac{\Gamma(n)}{\Gamma(y)\Gamma(n - y)}$$
and, using the identity $\Gamma(\alpha + 1) = \alpha\, \Gamma(\alpha)$,
$$a = \frac{y}{n},$$
i.e. $\hat{p} = Y/n$.
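The constant risk is easy to confirm by simulation. A minimal sketch (the sample size $n = 20$ is arbitrary): for every value of $p$, the Monte Carlo risk of $\hat{p}$ under the weighted loss is close to $1/n$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 20, 400_000

for p in [0.05, 0.3, 0.5, 0.9]:
    p_hat = rng.binomial(n, p, size=reps) / n
    loss = (p - p_hat) ** 2 / (p * (1 - p))
    print(p, loss.mean(), 1 / n)   # risk ~ 1/n for every p: an equalizer rule
```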

9.5 Bayes estimates

(i) The mean of $\pi(\theta \mid x)$ is the Bayes estimate with respect to quadratic loss.

Proof: Choose $\delta(x)$ to minimise
$$\int_\Theta [\theta - \delta(x)]^2\, \pi(\theta \mid x)\, d\theta$$
by differentiating with respect to $\delta$ and equating to zero. Then
$$\delta(x) = \int_\Theta \theta\, \pi(\theta \mid x)\, d\theta,$$
since $\int_\Theta \pi(\theta \mid x)\, d\theta = 1$.

(ii) The median of $\pi(\theta \mid x)$ is the Bayes estimate with respect to modular loss (i.e. absolute value loss).

Proof: Choose $\delta(x)$ to minimise
$$\int_\Theta |\theta - \delta(x)|\, \pi(\theta \mid x)\, d\theta = \int_{-\infty}^{\delta(x)} (\delta(x) - \theta)\, \pi(\theta \mid x)\, d\theta + \int_{\delta(x)}^{\infty} (\theta - \delta(x))\, \pi(\theta \mid x)\, d\theta.$$
Differentiate with respect to $\delta$ and equate to zero:
$$\int_{-\infty}^{\delta(x)} \pi(\theta \mid x)\, d\theta = \int_{\delta(x)}^{\infty} \pi(\theta \mid x)\, d\theta,$$
so that
$$2 \int_{-\infty}^{\delta(x)} \pi(\theta \mid x)\, d\theta = \int_{-\infty}^{\infty} \pi(\theta \mid x)\, d\theta = 1, \qquad \text{i.e.} \qquad \int_{-\infty}^{\delta(x)} \pi(\theta \mid x)\, d\theta = \frac{1}{2},$$
and $\delta(x)$ is the median of $\pi(\theta \mid x)$.

(iii) The mode of $\pi(\theta \mid x)$ is the Bayes estimate with respect to zero-one loss (i.e. loss of zero for a correct estimate and loss of one for any incorrect estimate).

Proof: The loss function has the form
$$L(\theta, \delta(x)) = \begin{cases} 1, & \text{if } \delta(x) \neq \theta, \\ 0, & \text{if } \delta(x) = \theta. \end{cases}$$
This is only a useful idea for a discrete distribution. Writing $\delta(x) = \theta^*$,
$$\sum_{\theta \in \Theta} L(\theta, \delta(x))\, \pi(\theta \mid x) = \sum_{\theta \in \Theta \setminus \{\theta^*\}} \pi(\theta \mid x) = 1 - \pi(\theta^* \mid x).$$
Minimisation means maximising $\pi(\theta^* \mid x)$, so $\delta(x)$ is taken to be the most likely value, the mode.

Remember that $\pi(\theta \mid x)$ represents our most up-to-date belief about $\theta$ and is all that is needed for inference.
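Characterisations (i) and (ii) can be checked numerically. A sketch with an arbitrary Beta(3, 5) posterior standing in for $\pi(\theta \mid x)$: the minimiser of the expected quadratic loss lands on the posterior mean, and the minimiser of the expected absolute loss lands on the posterior median.

```python
import numpy as np
from scipy import stats

post = stats.beta(3, 5)                 # stands in for pi(theta | x)
theta = np.linspace(0, 1, 4001)
w = post.pdf(theta)
w /= w.sum()                            # discretised posterior weights

d = np.linspace(0.01, 0.99, 981)        # candidate estimates delta(x)
quad = [np.sum(w * (theta - a) ** 2) for a in d]
absl = [np.sum(w * np.abs(theta - a)) for a in d]

print(d[np.argmin(quad)], post.mean())      # ~ posterior mean 3/8
print(d[np.argmin(absl)], post.median())    # ~ posterior median
```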

9.6 Credible intervals

Definition: The interval $(\theta_1, \theta_2)$ is a $100(1 - \alpha)\%$ credible interval for $\theta$ if
$$\int_{\theta_1}^{\theta_2} \pi(\theta \mid x)\, d\theta = 1 - \alpha, \qquad 0 \leq \alpha \leq 1.$$

[Figure: a posterior density $\pi(\theta \mid x)$ with two different intervals marked; both $(\theta_1, \theta_2)$ and $(\theta_1', \theta_2')$ are $100(1 - \alpha)\%$ credible intervals.]

Definition: The interval $(\theta_1, \theta_2)$ is a $100(1 - \alpha)\%$ highest posterior density interval if

(i) $(\theta_1, \theta_2)$ is a $100(1 - \alpha)\%$ credible interval;

(ii) for all $\theta \in (\theta_1, \theta_2)$ and $\theta' \notin (\theta_1, \theta_2)$, $\pi(\theta \mid x) \geq \pi(\theta' \mid x)$.

Obviously this defines the shortest possible $100(1 - \alpha)\%$ credible interval.

[Figure: a unimodal posterior density $\pi(\theta \mid x)$ cut by a horizontal line; the highest posterior density interval $(\theta_1, \theta_2)$ beneath the cut contains probability $1 - \alpha$.]
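For a unimodal posterior, the highest posterior density interval can be found by searching over all intervals with mass $1 - \alpha$ and keeping the shortest. A sketch with an arbitrary, skewed Beta(3, 9) posterior:

```python
import numpy as np
from scipy import stats

post = stats.beta(3, 9)                 # an illustrative, skewed posterior
alpha = 0.05

# Equal-tailed 95% credible interval
print(post.ppf([alpha / 2, 1 - alpha / 2]))

# HPD: among intervals (ppf(l), ppf(l + 1 - alpha)), take the shortest
lo = np.linspace(0, alpha, 2001)        # lower-tail probability l
lengths = post.ppf(lo + 1 - alpha) - post.ppf(lo)
best = lo[np.argmin(lengths)]
print(post.ppf([best, best + 1 - alpha]))   # shorter than equal-tailed
```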

9.7 Features of Bayesian models

(i) If there is a sufficient statistic, say $T$, the posterior distribution will depend upon the data only through $T$.

(ii) There is a natural parametric family of priors such that the posterior distributions also belong to this family. Such priors are called conjugate priors.

Example: Binomial likelihood, beta prior. A binomial likelihood has the form
$$p(x \mid \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x}, \qquad x = 0, 1, \ldots, n,$$
and a beta prior is
$$\pi(\theta) = \frac{\theta^{a - 1} (1 - \theta)^{b - 1}}{B(a, b)}, \qquad \theta \in [0, 1],$$
where
$$B(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)},$$
so that, since Posterior $\propto$ Likelihood $\times$ Prior,
$$\pi(\theta \mid x) \propto \theta^{x + a - 1} (1 - \theta)^{n - x + b - 1},$$
and comparison with the prior leads to
$$\pi(\theta \mid x) = \frac{\theta^{x + a - 1} (1 - \theta)^{n - x + b - 1}}{B(x + a, n - x + b)}, \qquad \theta \in [0, 1],$$
which is also a beta distribution.
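In code the conjugate update is a one-liner. A sketch with illustrative numbers (a Beta(2, 2) prior and 14 successes in 20 trials; both choices are arbitrary):

```python
from scipy import stats

a, b = 2, 2              # Beta(2, 2) prior on theta
n, x = 20, 14            # 14 successes in 20 trials

posterior = stats.beta(x + a, n - x + b)        # Beta(x+a, n-x+b), as derived
print(posterior.mean(), (x + a) / (n + a + b))  # posterior mean, two ways
print(posterior.interval(0.95))                 # equal-tailed credible interval
```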

9.8 Hypothesis testing

The Bayesian only has a concept of a hypothesis test in certain very specific situations. This is because of the alien idea of having to regard a parameter as a fixed point. However, the concept has some validity in an example such as the following.

Example: Water samples are taken from a reservoir and checked for bacterial content. These provide information about $\theta$, the proportion of infected samples. If it is believed that bacterial density is such that more than 1% of samples are infected, there is a legal obligation to change the chlorination procedure. The Bayesian procedure is:

(i) use the data to obtain the posterior distribution of $\theta$;

(ii) calculate $P(\theta > 0.01)$.

EC rules are formulated in statistical terms and, in such cases, allow for a 5% significance level. This could be interpreted in a Bayesian sense as allowing for a wrong decision with probability no more than 0.05, so order increased chlorination if $P(\theta > 0.01) > 0.95$.

Where for some reason (e.g. a legal reason) a fixed parameter value is of interest because it defines a threshold, a Bayesian concept of a test is possible. It is usually performed either by calculating a credible bound or a credible interval and looking at it in relation to the threshold value or values, or by calculating posterior probabilities based on the threshold or thresholds.
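A sketch of step (ii) with hypothetical numbers (3 infected samples out of 400, and a uniform Beta(1, 1) prior, so the posterior is Beta(4, 398) by the conjugate update of Section 9.7):

```python
from scipy import stats

n, x = 400, 3                        # hypothetical: 3 infected out of 400
post = stats.beta(x + 1, n - x + 1)  # posterior under a uniform prior

p_exceed = post.sf(0.01)             # posterior P(theta > 0.01)
print(p_exceed, p_exceed > 0.95)     # order increased chlorination if True
```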

9.9 Inferences for normal distributions

9.9.1 Unknown mean, known variance

Consider data from $N(\theta, \sigma^2)$ and a prior $\theta \sim N(\tau, \kappa^2)$. Then Posterior $\propto$ Likelihood $\times$ Prior, with
$$f(x \mid \theta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum (x_i - \theta)^2 \right)$$
and
$$\pi(\theta) = \frac{1}{\sqrt{2\pi}\,\kappa} \exp\left( -\frac{1}{2\kappa^2} (\theta - \tau)^2 \right), \qquad \theta \in \mathbb{R},$$
so
$$\pi(\theta \mid x, \sigma^2, \tau, \kappa^2) \propto \exp\left( -\frac{n}{2\sigma^2} (\theta^2 - 2\theta\bar{x}) - \frac{1}{2\kappa^2} (\theta^2 - 2\theta\tau) \right).$$
But
$$\frac{n}{2\sigma^2}(\theta^2 - 2\theta\bar{x}) + \frac{1}{2\kappa^2}(\theta^2 - 2\theta\tau) = \frac{1}{2}\left( \frac{n}{\sigma^2} + \frac{1}{\kappa^2} \right)\theta^2 - \left( \frac{n\bar{x}}{\sigma^2} + \frac{\tau}{\kappa^2} \right)\theta + \text{garbage} = \frac{1}{2}\left( \frac{n}{\sigma^2} + \frac{1}{\kappa^2} \right)\left( \theta - \frac{n\bar{x}/\sigma^2 + \tau/\kappa^2}{n/\sigma^2 + 1/\kappa^2} \right)^2 + \text{garbage},$$
so the posterior p.d.f. is proportional to
$$\exp\left[ -\frac{1}{2}\left( \frac{n}{\sigma^2} + \frac{1}{\kappa^2} \right)\left( \theta - \frac{n\bar{x}/\sigma^2 + \tau/\kappa^2}{n/\sigma^2 + 1/\kappa^2} \right)^2 \right].$$
This means that
$$\theta \mid x, \sigma^2, \tau, \kappa^2 \sim N\left( \frac{n\bar{x}/\sigma^2 + \tau/\kappa^2}{n/\sigma^2 + 1/\kappa^2},\ \left( \frac{n}{\sigma^2} + \frac{1}{\kappa^2} \right)^{-1} \right).$$
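A minimal sketch of this update (simulated data with known $\sigma^2 = 4$; the prior parameters are arbitrary): precisions add, and the posterior mean is the precision-weighted average derived above.

```python
import numpy as np

def normal_posterior(x, sigma2, tau, kappa2):
    """Posterior of theta for N(theta, sigma2) data with a N(tau, kappa2) prior."""
    n = len(x)
    prec = n / sigma2 + 1 / kappa2                            # posterior precision
    mean = (n * np.mean(x) / sigma2 + tau / kappa2) / prec    # weighted average
    return mean, 1 / prec                                     # (mean, variance)

rng = np.random.default_rng(3)
x = rng.normal(5.0, 2.0, size=50)                         # sigma2 = 4 known
print(normal_posterior(x, 4.0, tau=0.0, kappa2=100.0))    # mean close to xbar
```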

The posterior mean is a weighted average. The prior mean $\tau$ has weight $1/\kappa^2$: this is 1/(prior variance); and the observed sample mean has weight $n/\sigma^2$: this is 1/(variance of sample mean). This is reasonable because
$$\lim_{n \to \infty} \frac{n\bar{x}/\sigma^2 + \tau/\kappa^2}{n/\sigma^2 + 1/\kappa^2} = \bar{x} \qquad \text{and} \qquad \lim_{\kappa^2 \to \infty} \frac{n\bar{x}/\sigma^2 + \tau/\kappa^2}{n/\sigma^2 + 1/\kappa^2} = \bar{x}.$$
The posterior mean tends to $\bar{x}$ as the amount of data swamps the prior, or as prior beliefs become very vague.

9.9.2 Unknown variance, known mean

We need to assign a prior p.d.f. $\pi(\sigma^2)$. The $\Gamma$ family of p.d.f.s provides a flexible set of shapes over $[0, \infty)$. We take
$$\tau = \frac{1}{\sigma^2} \sim \Gamma\left( \frac{\nu}{2}, \frac{\nu\lambda}{2} \right),$$
where $\lambda$ and $\nu$ are chosen to provide suitable location and shape. In other words, the prior p.d.f. of $\tau = 1/\sigma^2$ is chosen to be of the form
$$\pi(\tau) = \frac{(\nu\lambda)^{\nu/2}\, \tau^{\nu/2 - 1}}{2^{\nu/2}\, \Gamma(\nu/2)} \exp\left( -\frac{\nu\lambda\tau}{2} \right), \qquad \tau \geq 0,$$
or
$$\pi(\sigma^2) = \frac{(\nu\lambda)^{\nu/2}\, (\sigma^2)^{-\nu/2 - 1}}{2^{\nu/2}\, \Gamma(\nu/2)} \exp\left( -\frac{\nu\lambda}{2\sigma^2} \right).$$
$\tau = 1/\sigma^2$ is called the precision. The posterior p.d.f. is given by
$$\pi(\sigma^2 \mid x, \theta) \propto f(x \mid \theta, \sigma^2)\, \pi(\sigma^2),$$
and the right-hand side, as a function of $\sigma^2$, is proportional to
$$(\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum (x_i - \theta)^2 \right) (\sigma^2)^{-\nu/2 - 1} \exp\left( -\frac{\nu\lambda}{2\sigma^2} \right).$$

Thus $\pi(\sigma^2 \mid x, \theta)$ is proportional to
$$(\sigma^2)^{-(\nu + n)/2 - 1} \exp\left( -\frac{1}{2\sigma^2} \left[ \nu\lambda + \sum (x_i - \theta)^2 \right] \right).$$
This is the same as the prior except that $\nu$ is replaced by $\nu + n$ and $\lambda$ is replaced by
$$\frac{\nu\lambda + \sum (x_i - \theta)^2}{\nu + n}.$$
Consider the random variable
$$W = \frac{\nu\lambda + \sum (x_i - \theta)^2}{\sigma^2}.$$
Clearly
$$f_W(w) = C\, w^{(\nu + n)/2 - 1}\, e^{-w/2}, \qquad w \geq 0.$$
This is a $\chi^2$-distribution with $\nu + n$ degrees of freedom. In other words,
$$\frac{\nu\lambda + \sum (x_i - \theta)^2}{\sigma^2} \,\Big|\, x \sim \chi^2(\nu + n).$$
Again this agrees with intuition: as $n \to \infty$ we approach the classical estimate, and likewise as $\nu \to 0$, which corresponds to vague prior knowledge.
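A sampling check of this result (a sketch; $\nu$, $\lambda$ and the data are arbitrary): draw $\tau = 1/\sigma^2$ from its posterior, which by the working above is $\Gamma\big((\nu+n)/2,\ (\nu\lambda + \sum(x_i - \theta)^2)/2\big)$, and confirm that $W$ follows $\chi^2(\nu + n)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
nu, lam, theta = 4.0, 1.5, 0.0
x = rng.normal(theta, 1.3, size=30)
n, S = len(x), ((x - theta) ** 2).sum()

# Posterior of tau = 1/sigma^2: Gamma((nu+n)/2) with rate (nu*lam + S)/2
tau = rng.gamma(shape=(nu + n) / 2, scale=2 / (nu * lam + S), size=100_000)
W = (nu * lam + S) * tau              # = (nu*lam + S)/sigma^2

# Kolmogorov-Smirnov distance from chi-square(nu + n) should be tiny
print(stats.kstest(W, stats.chi2(nu + n).cdf))
```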

9.9.3 Mean and variance unknown

We now have to assign a joint prior p.d.f. $\pi(\theta, \sigma^2)$ to obtain a joint posterior p.d.f.
$$\pi(\theta, \sigma^2 \mid x) \propto f(x \mid \theta, \sigma^2)\, \pi(\theta, \sigma^2).$$
We shall look at the special case of a non-informative prior. From theoretical considerations beyond the scope of this course, a non-informative prior may be approximated by
$$\pi(\theta, \sigma^2) \propto \frac{1}{\sigma^2}.$$
Note that this is not to be thought of as an expression of prior beliefs, but rather as an approximate prior which allows the posterior to be dominated by the data. Then
$$\pi(\theta, \sigma^2 \mid x) \propto (\sigma^2)^{-n/2 - 1} \exp\left( -\frac{1}{2\sigma^2} \sum (x_i - \theta)^2 \right).$$
The right-hand side may be written in the form
$$(\sigma^2)^{-n/2 - 1} \exp\left( -\frac{1}{2\sigma^2} \left[ \sum (x_i - \bar{x})^2 + n(\bar{x} - \theta)^2 \right] \right).$$
We can use this form to integrate out, in turn, $\theta$ and $\sigma^2$. Using
$$\int_{-\infty}^{\infty} \exp\left( -\frac{n}{2\sigma^2} (\theta - \bar{x})^2 \right) d\theta = \sqrt{\frac{2\pi\sigma^2}{n}},$$
we find
$$\pi(\sigma^2 \mid x) \propto (\sigma^2)^{-(n-1)/2 - 1} \exp\left( -\frac{1}{2\sigma^2} \sum (x_i - \bar{x})^2 \right)$$
and, putting $W = \sum (x_i - \bar{x})^2 / \sigma^2$,
$$f_W(w) = C\, w^{(n-1)/2 - 1}\, e^{-w/2}, \qquad w \geq 0,$$
so we see that
$$\frac{\sum (x_i - \bar{x})^2}{\sigma^2} \,\Big|\, x \sim \chi^2(n - 1).$$
To integrate out $\sigma^2$, make the substitution $\sigma^2 = \tau^{-1}$. Then $\pi(\theta \mid x)$ is proportional to
$$\int_0^\infty \tau^{n/2 - 1} \exp\left( -\frac{\tau}{2} \left[ \sum (x_i - \bar{x})^2 + n(\bar{x} - \theta)^2 \right] \right) d\tau.$$
From the definition of the $\Gamma$-function,
$$\int_0^\infty y^{\alpha - 1} e^{-\beta y}\, dy = \frac{\Gamma(\alpha)}{\beta^\alpha},$$
and we see, since only the terms in $\theta$ are relevant,
$$\pi(\theta \mid x) \propto \left( \sum (x_i - \bar{x})^2 + n(\bar{x} - \theta)^2 \right)^{-n/2}.$$

That is,
$$\pi(\theta \mid x) \propto \left( 1 + \frac{t^2}{n - 1} \right)^{-n/2},$$
where
$$t = \frac{\sqrt{n}\,(\theta - \bar{x})}{\sqrt{\sum (x_i - \bar{x})^2 / (n - 1)}}.$$
This is the kernel of a Student $t$-distribution: given the data, $t$ has a $t$-distribution with $n - 1$ degrees of freedom.
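Equivalently, $\theta \mid x$ is a scaled and shifted $t_{n-1}$ variable, which scipy can handle directly. A sketch with simulated data: the grid-normalised kernel above matches the $t$ density with location $\bar{x}$ and scale $s/\sqrt{n}$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(10.0, 3.0, size=12)
n, xbar = len(x), x.mean()
S = ((x - xbar) ** 2).sum()
s = np.sqrt(S / (n - 1))                      # sample standard deviation

# theta | x is t_{n-1} with location xbar and scale s/sqrt(n)
post = stats.t(df=n - 1, loc=xbar, scale=s / np.sqrt(n))
print(post.interval(0.95))                    # a 95% credible interval for theta

# Grid check against the kernel (S + n*(xbar - theta)^2)^(-n/2)
theta = np.linspace(xbar - 10, xbar + 10, 4001)
kernel = (S + n * (xbar - theta) ** 2) ** (-n / 2)
kernel /= kernel.sum() * (theta[1] - theta[0])
print(np.max(np.abs(kernel - post.pdf(theta))))   # ~0: same distribution
```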
