Module 22: Bayesian Methods Lecture 9 A: Default prior selection
1 Module 22: Bayesian Methods, Lecture 9 A: Default prior selection
Peter Hoff, Departments of Statistics and Biostatistics, University of Washington
2 Outline
- Jeffreys prior
- Unit information priors
- Empirical Bayes priors
3 Independent binary sequence
Suppose researcher A has data of the following type:
$M_A$: $y_1, \ldots, y_n \sim$ i.i.d. binary($\theta$), $\theta \in [0, 1]$.
A asks you to do a Bayesian analysis, but either doesn't have any prior information about $\theta$, or wants you to obtain "objective" Bayesian inference for $\theta$.
You need to come up with some prior $\pi_A(\theta)$ to use for this analysis.
4 Independent binary sequence
Suppose researcher B has data of the following type:
$M_B$: $y_1, \ldots, y_n \sim$ i.i.d. binary$\left(\frac{e^{\gamma}}{1+e^{\gamma}}\right)$, $\gamma \in (-\infty, \infty)$.
B asks you to do a Bayesian analysis, but either doesn't have any prior information about $\gamma$, or wants you to obtain "objective" Bayesian inference for $\gamma$.
You need to come up with some prior $\pi_B(\gamma)$ to use for this analysis.
5 Prior generating procedures
Suppose we have a procedure for generating priors from models:
Procedure($M$) $\to \pi$
Applying the procedure to model $M_A$ should generate a prior for $\theta$:
Procedure($M_A$) $\to \pi_A(\theta)$
Applying the procedure to model $M_B$ should generate a prior for $\gamma$:
Procedure($M_B$) $\to \pi_B(\gamma)$
What should the relationship between $\pi_A$ and $\pi_B$ be?
6 Induced priors
Note that a prior $\pi_A(\theta)$ over $\theta$ induces a prior $\pi_A(\gamma)$ over $\gamma = \log \frac{\theta}{1-\theta}$.
This induced prior can be obtained via
- calculus;
- simulation.
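7 Induced priors
    theta <- rbeta(n, 1, 1)            # n = number of draws; the value is missing from the transcript
    gamma <- log(theta / (1 - theta))
[Figure: simulated values of $\theta$ and the induced values of $\gamma$.]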
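The simulation route can also be sketched in Python. This is a minimal sketch, not part of the original slides: the sample size of 20,000, the seed, and the names `logit` and `draw_induced_prior` are illustrative assumptions. Under a uniform beta(1, 1) prior on $\theta$, the change-of-variables formula gives the induced density $e^{\gamma}/(1+e^{\gamma})^2$ for $\gamma$, which is the standard logistic distribution, so the simulated values should be centered near 0 with variance near $\pi^2/3$.

```python
import math
import random
import statistics


def logit(theta):
    """Log-odds transform gamma = log(theta / (1 - theta))."""
    return math.log(theta / (1.0 - theta))


def draw_induced_prior(n_draws, a=1.0, b=1.0, seed=0):
    """Simulate gamma = logit(theta) with theta ~ beta(a, b)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        theta = rng.betavariate(a, b)
        # Skip the endpoints 0 and 1, where the logit is undefined.
        if 0.0 < theta < 1.0:
            draws.append(logit(theta))
    return draws


gammas = draw_induced_prior(20000)
print("mean of gamma:", statistics.fmean(gammas))
print("variance of gamma:", statistics.variance(gammas))
print("logistic variance pi^2/3:", math.pi ** 2 / 3)
```

A histogram of `gammas` would show the bell-like logistic shape: a prior that is flat in $\theta$ is far from flat in $\gamma$.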
8 Internally consistent procedures
This fact creates a small conundrum:
We could generate a prior for $\gamma$ via the induced prior on $\theta$:
Procedure($M_A$) $\to \pi_A(\theta) \to \pi_A(\gamma)$
Alternatively, a prior for $\gamma$ could be obtained directly from $M_B$:
Procedure($M_B$) $\to \pi_B(\gamma)$
Both $\pi_A(\gamma)$ and $\pi_B(\gamma)$ are obtained from the Procedure. Which one should we use?
9 Jeffreys' principle
Jeffreys (1949) says that any default Procedure should be internally consistent in the sense that the two priors on $\gamma$ should be the same.
More generally, his principle states that if $M_B$ is a reparameterization of $M_A$, then $\pi_A(\gamma) = \pi_B(\gamma)$.
Of course, all of this logic applies to the model in terms of $\theta$:
- Procedure($M_A$) $\to \pi_A(\theta)$
- Procedure($M_B$) $\to \pi_B(\gamma) \to \pi_B(\theta)$
- $\pi_A(\theta) = \pi_B(\theta)$
10 Jeffreys prior
It turns out that Jeffreys' principle leads to a unique Procedure:
$\pi_J(\theta) \propto \sqrt{E\left[\left(\frac{d}{d\theta} \log p(y \mid \theta)\right)^2\right]}$
Example: Binomial/binary model
$y_1, \ldots, y_n \sim$ i.i.d. binary($\theta$)
$\pi_J(\theta) \propto \theta^{-1/2} (1-\theta)^{-1/2}$
We recognize this prior as a beta(1/2, 1/2) distribution:
$\theta \sim$ beta(1/2, 1/2)
Default Bayesian inference is then based on the following posterior:
$\theta \mid y_1, \ldots, y_n \sim$ beta$\left(1/2 + \sum y_i,\ 1/2 + \sum (1 - y_i)\right)$.
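As a worked check, not part of the original slides, the binary example can be derived from the expected squared score of a single observation, and the result can be verified to satisfy Jeffreys' principle under the log-odds reparameterization $\gamma = \log\frac{\theta}{1-\theta}$:

```latex
\begin{align*}
\log p(y \mid \theta) &= y \log\theta + (1-y)\log(1-\theta), \\
\frac{d}{d\theta}\log p(y \mid \theta) &= \frac{y}{\theta} - \frac{1-y}{1-\theta}
  = \frac{y-\theta}{\theta(1-\theta)}, \\
E\left[\left(\frac{d}{d\theta}\log p(y \mid \theta)\right)^2\right]
  &= \frac{\operatorname{Var}(y)}{\theta^2(1-\theta)^2} = \frac{1}{\theta(1-\theta)}, \\
\pi_J(\theta) &\propto \theta^{-1/2}(1-\theta)^{-1/2}. \\[6pt]
\text{Induced prior on } \gamma:\quad
\pi_J(\theta)\left|\frac{d\theta}{d\gamma}\right|
  &= \theta^{-1/2}(1-\theta)^{-1/2}\,\theta(1-\theta) = \sqrt{\theta(1-\theta)}, \\
\text{Direct prior on } \gamma:\quad
\sqrt{\frac{1}{\theta(1-\theta)}\left(\frac{d\theta}{d\gamma}\right)^2}
  &= \sqrt{\theta(1-\theta)}.
\end{align*}
```

The two routes agree, using $d\theta/d\gamma = \theta(1-\theta)$. With $n$ observations the expected squared score is multiplied by $n$, which only changes the proportionality constant.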
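11 Jeffreys prior
Example: Poisson model
$y_1, \ldots, y_n \sim$ i.i.d. Poisson($\theta$)
$\pi_J(\theta) \propto 1/\sqrt{\theta}$
Recall our conjugate prior for $\theta$ in this case was a gamma($a$, $b$) density, with $b$ a rate parameter:
$\pi(\theta \mid a, b) \propto \theta^{a-1} e^{-b\theta}$
For the Poisson model and gamma prior,
$\theta \sim$ gamma($a$, $b$) $\Rightarrow$ $\theta \mid y_1, \ldots, y_n \sim$ gamma$\left(a + \sum y_i,\ b + n\right)$
What about under the Jeffreys prior? $\pi_J(\theta)$ looks like a gamma distribution with $(a, b) = (1/2, 0)$. It follows that
$\theta \sim \pi_J$ $\Rightarrow$ $\theta \mid y_1, \ldots, y_n \sim$ gamma$\left(1/2 + \sum y_i,\ n\right)$.
(Note: $\pi_J$ is not an actual gamma density; it is not a probability density at all!)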
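The Poisson update is simple enough to write as a short Python sketch. The function name `poisson_gamma_posterior` and the example counts are hypothetical, chosen only for illustration; the second gamma parameter is treated as a rate, matching the $b + n$ update above.

```python
def poisson_gamma_posterior(y, a, b):
    """Return (shape, rate) of the gamma posterior for Poisson counts y
    under a gamma(a, b) prior, with b acting as a rate parameter."""
    return a + sum(y), b + len(y)


# Hypothetical counts, made up for illustration only.
counts = [2, 0, 3, 1, 4]

# Jeffreys prior: formally (a, b) = (1/2, 0).
shape_j, rate_j = poisson_gamma_posterior(counts, 0.5, 0.0)
mean_j = shape_j / rate_j

print("posterior: gamma(%.1f, %.1f)" % (shape_j, rate_j))
print("posterior mean:", mean_j, "sample mean:", sum(counts) / len(counts))
```

Although the prior is improper, the posterior is a proper gamma distribution whenever $n \geq 1$, since its shape $1/2 + \sum y_i$ and rate $n$ are both positive.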
12 Jeffreys prior
Example: Normal model
$y_1, \ldots, y_n \sim$ i.i.d. Normal($\mu$, $\sigma^2$)
$\pi_J(\mu, \sigma^2) = 1/\sigma^2$
(this is a particular version of Jeffreys prior for multiparameter problems)
It is very interesting to note that the resulting posterior for $\mu$ is
$\frac{\mu - \bar{y}}{s/\sqrt{n}} \sim t_{n-1}$
This means that a 95% objective Bayesian confidence interval for $\mu$ is
$\mu \in \bar{y} \pm t_{.975,\,n-1}\, s/\sqrt{n}$
This is exactly the same as the usual t-confidence interval for a normal mean.
13 Notes on Jeffreys prior
1. Jeffreys' principle leads to Jeffreys prior.
2. Jeffreys prior isn't always a proper prior distribution.
3. Improper priors can lead to proper posteriors. These often lead to Bayesian interpretations of frequentist procedures.
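14 Data-based priors
Recall from the binary/beta analysis:
$\theta \sim$ beta($a$, $b$)
$y_1, \ldots, y_n \sim$ i.i.d. binary($\theta$)
$\theta \mid y_1, \ldots, y_n \sim$ beta$\left(a + \sum y_i,\ b + \sum (1 - y_i)\right)$
Under this posterior,
$E[\theta \mid y_1, \ldots, y_n] = \frac{a + \sum y_i}{a + b + n} = \left(\frac{a+b}{a+b+n}\right)\frac{a}{a+b} + \left(\frac{n}{a+b+n}\right)\bar{y}$
- $\frac{a}{a+b}$: guess at what $\theta$ is
- $a + b$: confidence in guess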
15 Data-based priors
We may be reluctant to guess at what $\theta$ is. Wouldn't $\bar{y}$ be better than a guess?
Idea: Set $\frac{a}{a+b} = \bar{y}$.
Problem: This is cheating! Using $\bar{y}$ for your prior misrepresents the amount of information you have.
Solution: Cheat as little as possible:
- Set $\frac{a}{a+b} = \bar{y}$.
- Set $a + b = 1$.
This implies $a = \bar{y}$, $b = 1 - \bar{y}$.
The amount of cheating has the information content of only one observation.
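A minimal Python sketch of this recipe for binary data follows. The function name `unit_information_beta_posterior` and the example data are assumptions made for illustration, not part of the lecture.

```python
def unit_information_beta_posterior(y):
    """Beta posterior parameters under the data-based prior
    a = ybar, b = 1 - ybar (prior weight a + b = 1)."""
    n = len(y)
    total = sum(y)
    y_bar = total / n
    a, b = y_bar, 1.0 - y_bar
    return a + total, b + (n - total)


# Hypothetical binary observations, made up for illustration only.
data = [1, 0, 0, 1, 1, 0, 1, 1]
post_a, post_b = unit_information_beta_posterior(data)
post_mean = post_a / (post_a + post_b)

print("posterior: beta(%.3f, %.3f)" % (post_a, post_b))
print("posterior mean:", post_mean)
```

Because the prior is centered at $\bar{y}$, the posterior mean is again $\bar{y}$; the prior only affects the spread, which is that of $n + 1$ rather than $n$ observations. If every observation is the same, $a$ or $b$ is 0 and the resulting beta distribution is improper.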
16 Unit information principle
If you don't have prior information about $\theta$, then
1. Obtain an MLE/OLS estimator $\hat{\theta}$ of $\theta$;
2. Make the prior $\pi(\theta)$ weakly centered around $\hat{\theta}$, with the information equivalent of one observation.
Again, such a prior leads to double use of the information in your sample. However, the amount of cheating is small, and decreases with $n$.
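17 Poisson example
$y_1, \ldots, y_n \sim$ i.i.d. Poisson($\theta$)
Under the gamma($a$, $b$) prior,
$E[\theta \mid y_1, \ldots, y_n] = \frac{a + \sum y_i}{b + n} = \left(\frac{b}{b+n}\right)\frac{a}{b} + \left(\frac{n}{b+n}\right)\bar{y}$
Unit information prior: $a/b = \bar{y}$, $b = 1$ $\Rightarrow$ $(a, b) = (\bar{y}, 1)$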
18 Comparison to Jeffreys prior
[Figure: two panels, CI width and CI coverage probability, each plotted against $n$, with points labeled j (Jeffreys prior) and u (unit information prior).]
19 Notes on UI priors
1. UI priors weakly concentrate around a data-based estimator.
2. Inference under UI priors is anti-conservative, but this bias decreases with $n$.
3. Can be used in multiparameter settings, and is related to BIC.
20 Normal means problem
Task: Estimate $\theta = (\theta_1, \ldots, \theta_p)$.
An odd problem:
$y_j = \theta_j + \epsilon_j$, $\epsilon_1, \ldots, \epsilon_p \sim$ i.i.d. normal(0, 1)
- What does estimation of $\theta_j$ have to do with estimation of $\theta_k$?
- There is only one observation $y_j$ per parameter $\theta_j$. How well can we do?
Where the problem comes from:
- Comparison of two groups A and B on $p$ variables (e.g. expression levels)
- For each variable $j$, construct a two-sample t-statistic $y_j = \frac{\bar{x}_{A,j} - \bar{x}_{B,j}}{s_j/\sqrt{n}}$
- For each $j$, $y_j$ is approximately normal with mean $\theta_j = \sqrt{n}(\mu_{A,j} - \mu_{B,j})/\sigma_j$ and variance 1.
21 Normal means problem
$y_j = \theta_j + \epsilon_j$, $\epsilon_1, \ldots, \epsilon_p \sim$ i.i.d. normal(0, 1)
One obvious estimator of $\theta = (\theta_1, \ldots, \theta_p)$ is $y = (y_1, \ldots, y_p)$.
- $y$ is the MLE;
- $y$ is unbiased and the UMVUE.
However, it turns out that $y$ is not so great in terms of risk:
$R(y, \theta) = E\left[\sum_{j=1}^{p} (y_j - \theta_j)^2\right]$
When $p > 2$ we can find an estimator that beats $y$ for every value of $\theta$, and is much better when $p$ is large. This estimator has been referred to as an empirical Bayes estimator.
22 Bayesian normal means problem
$y_j = \theta_j + \epsilon_j$, $\epsilon_1, \ldots, \epsilon_p \sim$ i.i.d. normal(0, 1)
Consider the following prior on $\theta$:
$\theta_1, \ldots, \theta_p \sim$ i.i.d. normal(0, $\tau^2$)
Under this prior,
$\hat{\theta}_j = E[\theta_j \mid y_1, \ldots, y_p] = \frac{\tau^2}{\tau^2 + 1}\, y_j$
This is a type of shrinkage prior:
- It shrinks the estimates towards zero, away from $y_j$;
- It is particularly good if many of the true $\theta_j$'s are very small or zero.
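The shrinkage factor follows from standard normal-normal conjugacy. A short derivation, added here as a worked step rather than taken from the slides:

```latex
\begin{align*}
p(\theta_j \mid y_j) &\propto p(y_j \mid \theta_j)\,\pi(\theta_j)
  \propto \exp\left\{-\tfrac{1}{2}(y_j - \theta_j)^2\right\}
          \exp\left\{-\tfrac{1}{2\tau^2}\theta_j^2\right\} \\
  &\propto \exp\left\{-\tfrac{1}{2}\left(1 + \tfrac{1}{\tau^2}\right)
     \left(\theta_j - \frac{\tau^2}{\tau^2+1}\,y_j\right)^2\right\}, \\
\theta_j \mid y_j &\sim \text{normal}\left(\frac{\tau^2}{\tau^2+1}\,y_j,\ \frac{\tau^2}{\tau^2+1}\right).
\end{align*}
```

Because the pairs $(\theta_j, y_j)$ are independent across $j$ when $\tau^2$ is known, conditioning on all of $y_1, \ldots, y_p$ gives the same posterior mean. As $\tau^2 \to \infty$ the factor tends to 1 (no shrinkage), and as $\tau^2 \to 0$ it tends to 0 (complete shrinkage to zero).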
23 Empirical Bayes
$\hat{\theta}_j = \frac{\tau^2}{\tau^2 + 1}\, y_j$
- We might know we want to shrink towards zero.
- We might not know the appropriate amount of shrinkage.
Solution: Estimate $\tau^2$ from the data!
$y_j = \theta_j + \epsilon_j$, $\epsilon_j \sim N(0, 1)$, $\theta_j \sim N(0, \tau^2)$ $\Rightarrow$ $y_j \sim N(0, \tau^2 + 1)$
We should have
$\sum y_j^2 \approx p(\tau^2 + 1)$, so $\sum y_j^2 / p - 1 \approx \tau^2$
Idea: Use $\hat{\tau}^2 = \sum y_j^2 / p - 1$ for the shrinkage estimator.
Modification: Use $\hat{\tau}^2 = \sum y_j^2 / (p - 2) - 1$ for the shrinkage estimator.
24 James-Stein estimation
$\hat{\theta}_j = \frac{\hat{\tau}^2}{\hat{\tau}^2 + 1}\, y_j$, $\hat{\tau}^2 = \sum y_j^2 / (p - 2) - 1$
It has been shown theoretically that, from a non-Bayesian perspective, this estimator beats $y$ in terms of risk for all $\theta$:
$R(\hat{\theta}, \theta) < R(y, \theta)$ for all $\theta$
Also, from a Bayesian perspective, this estimator is almost as good as the optimal Bayes estimator under a known $\tau^2$.
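The risk comparison can be checked by simulation. The sketch below is illustrative only: the true mean vector `theta_true` (ten entries of 0.5), the 5,000 replications, the seed, and the function names are all assumptions. It uses the plain estimator from the slide, not the positive-part variant.

```python
import random


def james_stein(y):
    """Shrinkage estimate using tau2_hat = sum(y_j^2) / (p - 2) - 1."""
    p = len(y)
    if p <= 2:
        raise ValueError("the estimator requires p > 2")
    tau2_hat = sum(v * v for v in y) / (p - 2) - 1.0
    shrink = tau2_hat / (tau2_hat + 1.0)  # equals 1 - (p - 2) / sum(y_j^2)
    return [shrink * v for v in y]


def squared_error(estimate, theta):
    """Sum of squared errors over the p coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def simulate_risks(theta, n_reps=5000, seed=1):
    """Monte Carlo estimates of the risk of y and of the James-Stein estimator."""
    rng = random.Random(seed)
    loss_mle = 0.0
    loss_js = 0.0
    for _ in range(n_reps):
        y = [t + rng.gauss(0.0, 1.0) for t in theta]
        loss_mle += squared_error(y, theta)
        loss_js += squared_error(james_stein(y), theta)
    return loss_mle / n_reps, loss_js / n_reps


theta_true = [0.5] * 10
risk_mle, risk_js = simulate_risks(theta_true)
print("estimated risk of y:", risk_mle)
print("estimated risk of James-Stein:", risk_js)
```

The risk of $y$ is exactly $p$ for every $\theta$, so the first number should be close to 10. The gain from shrinkage is largest when $\theta$ is near the origin and fades as $\sum \theta_j^2$ grows.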
25 Comparison of risks
[Figure: Bayes risk plotted against $\tau^2$, for $p \in \{3, 5, 10, 20\}$.]
The Bayes risk of the JSE is between that of $y$ and the Bayes estimator.
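26 Empirical Bayes in general
Model: $p(y \mid \theta)$, $\theta \in \Theta$
Prior class: $\pi(\theta \mid \psi)$, $\psi \in \Psi$
What value of $\psi$ to choose?
Empirical Bayes:
1. Obtain the marginal likelihood $p(y \mid \psi) = \int p(y \mid \theta)\, \pi(\theta \mid \psi)\, d\theta$;
2. Find an estimator $\hat{\psi}$ based on $p(y \mid \psi)$;
3. Use the prior $\pi(\theta \mid \hat{\psi})$.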
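As a worked instance of these three steps, take the normal means model with $\psi = \tau^2$. The derivation below is a sketch added for illustration; it uses maximum likelihood in step 2, which is one possible choice of estimator.

```latex
\begin{align*}
\text{1. } & p(y \mid \tau^2) = \prod_{j=1}^{p} \int \text{normal}(y_j;\ \theta_j, 1)\,
      \text{normal}(\theta_j;\ 0, \tau^2)\, d\theta_j
    = \prod_{j=1}^{p} \text{normal}(y_j;\ 0, \tau^2 + 1), \\
\text{2. } & \log p(y \mid \tau^2) = -\frac{p}{2}\log(\tau^2 + 1)
      - \frac{\sum_j y_j^2}{2(\tau^2 + 1)} + \text{const}
    \quad\Rightarrow\quad
    \hat{\tau}^2 = \max\left\{0,\ \frac{1}{p}\sum_{j} y_j^2 - 1\right\}, \\
\text{3. } & \theta_j \sim \text{normal}(0, \hat{\tau}^2)
    \quad\Rightarrow\quad
    \hat{\theta}_j = \frac{\hat{\tau}^2}{\hat{\tau}^2 + 1}\, y_j .
\end{align*}
```

Apart from the truncation at zero, this is the first ("Idea") estimator from the normal means example; replacing $p$ by $p - 2$ gives the James-Stein version.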
27 Notes on empirical Bayes
1. Empirical Bayes procedures are obtained by estimating hyperparameters from the data.
2. Often these procedures behave well from both Bayes and frequentist perspectives.
3. They work best when the number of parameters is large and hyperparameters are distinguishable.
More informationBayesian model selection for computer model validation via mixture model estimation
Bayesian model selection for computer model validation via mixture model estimation Kaniav Kamary ATER, CNAM Joint work with É. Parent, P. Barbillon, M. Keller and N. Bousquet Outline Computer model validation
More informationECE531 Lecture 8: Non-Random Parameter Estimation
ECE531 Lecture 8: Non-Random Parameter Estimation D. Richard Brown III Worcester Polytechnic Institute 19-March-2009 Worcester Polytechnic Institute D. Richard Brown III 19-March-2009 1 / 25 Introduction
More informationThe Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13
The Jeffreys Prior Yingbo Li Clemson University MATH 9810 Yingbo Li (Clemson) The Jeffreys Prior MATH 9810 1 / 13 Sir Harold Jeffreys English mathematician, statistician, geophysicist, and astronomer His
More informationA Discussion of the Bayesian Approach
A Discussion of the Bayesian Approach Reference: Chapter 10 of Theoretical Statistics, Cox and Hinkley, 1974 and Sujit Ghosh s lecture notes David Madigan Statistics The subject of statistics concerns
More informationBasic of Probability Theory for Ph.D. students in Education, Social Sciences and Business (Shing On LEUNG and Hui Ping WU) (May 2015)
Basic of Probability Theory for Ph.D. students in Education, Social Sciences and Business (Shing On LEUNG and Hui Ping WU) (May 2015) This is a series of 3 talks respectively on: A. Probability Theory
More informationSTAT 135 Lab 5 Bootstrapping and Hypothesis Testing
STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,
More informationBayesian performance
Bayesian performance Frequentist properties of estimators refer to the performance of an estimator (say the posterior mean) over repeated experiments under the same conditions. The posterior distribution
More information