Other Noninformative Priors

Other methods for constructing noninformative priors include:

Bernardo's reference prior, which seeks a prior that will maximize the discrepancy between the prior and the posterior and minimize the discrepancy between the likelihood and the posterior (a "dominant likelihood" prior).

An improper prior, in which ∫_Θ p(θ) dθ = ∞.

A highly diffuse proper prior, e.g., for normal data with µ unknown, a N(0, 1000000) prior for µ. (This is very close to the improper prior p(µ) ∝ 1.)
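A quick numerical check of that last point, as a minimal sketch in R (not from the slides): over any plausible range of µ, the N(0, 1000000) density is essentially constant, which is why it behaves almost like a flat prior.

mu <- seq(-50, 50, length.out = 5)
dnorm(mu, mean = 0, sd = 1000)   # sd = sqrt(1000000); the five values are nearly identical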

Informative Prior Forms

Informative prior information is usually based on expert opinion or previous research about the parameter(s) of interest.

Power Priors: Suppose we have access to previous data x_0 that are analogous to the data we will gather. Then our power prior could be

p(θ | x_0, a_0) ∝ p(θ) [L(θ | x_0)]^(a_0)

where p(θ) is an ordinary prior and a_0 ∈ [0, 1] is an exponent measuring the influence of the previous data.

Power Priors

As a_0 → 0, the influence of the previous data is lessened. As a_0 → 1, the influence of the previous data is strengthened. The posterior, given our actual data x, is then

π(θ | x, x_0, a_0) ∝ p(θ | x_0, a_0) L(θ | x)

To avoid specifying a single a_0 value, we could put a prior, say a beta distribution p(a_0), on a_0 and average over values of a_0 in [0, 1]:

p(θ | x_0) = ∫_0^1 p(θ) [L(θ | x_0)]^(a_0) p(a_0) da_0
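To make the discounting concrete, here is a minimal sketch in R with purely hypothetical numbers (the prior, the data values, and a_0 are all assumed, not from the slides). For a binomial success probability θ with an ordinary Beta(α, β) prior and historical data of s_0 successes in n_0 trials, the power prior works out to Beta(α + a_0 s_0, β + a_0 (n_0 − s_0)), so the posterior after the current data stays in the beta family.

# Power prior sketch for a binomial success probability theta.
# All numeric values below are assumed for illustration only.
alpha <- 1; beta <- 1          # ordinary Beta(1, 1) prior p(theta)
s0 <- 30; n0 <- 100            # historical data x0: 30 successes in 100 trials
s  <- 12; n  <- 40             # current data x: 12 successes in 40 trials
a0 <- 0.5                      # discounting exponent in [0, 1]

# Power prior: Beta(alpha + a0*s0, beta + a0*(n0 - s0)).
# The current data then add s and (n - s) to the two shape parameters.
post_shape1 <- alpha + a0 * s0 + s
post_shape2 <- beta + a0 * (n0 - s0) + (n - s)

# Posterior mean; a0 = 0 ignores x0 entirely, a0 = 1 pools x0 fully with x.
post_shape1 / (post_shape1 + post_shape2)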

Prior Elicitation

A challenge is putting expert opinion into a form in which it can be used as a prior distribution. Strategies: Request guesses for several quantiles (maybe {0.1, 0.25, 0.5, 0.75, 0.9}?) from a few experts. For a normal prior, note that a quantile q(α) is related to the standard normal quantile Φ^(−1)(α) by

q(α) = mean + Φ^(−1)(α) × (std. dev.)

Via regression of the provided q(α) values on the corresponding Φ^(−1)(α) values, we can get estimates of the mean and standard deviation of the normal prior: the intercept estimates the mean and the slope estimates the standard deviation.
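A minimal sketch of this strategy in R (the elicited quantile values below are hypothetical): regress the expert's quantile guesses on the standard normal quantiles; the fitted intercept and slope estimate the prior mean and standard deviation.

# Elicited quantile guesses (assumed values for illustration).
alpha <- c(0.10, 0.25, 0.50, 0.75, 0.90)
q     <- c(2.1, 3.4, 4.9, 6.5, 7.8)   # expert's guesses for q(alpha)
z     <- qnorm(alpha)                 # standard normal quantiles Phi^{-1}(alpha)

# Fit q(alpha) = mean + z * sd by least squares.
fit <- lm(q ~ z)
coef(fit)   # (Intercept) estimates the prior mean; the z slope estimates the prior sd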

Prior Elicitation

Another strategy asks the expert to provide a predictive modal value (most likely value) for the parameter, and then requests a rough 67% interval from the expert. With a normal prior, the length of this interval is roughly twice the prior standard deviation, since about two-thirds of a normal distribution's mass lies within one standard deviation of the mean. For a prior on a Bernoulli probability, the most likely probability of success is often clear to the expert.
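A minimal sketch of the arithmetic in R (the elicited values are hypothetical):

mode_guess <- 5.0                 # expert's most likely value (assumed)
interval   <- c(3.5, 6.5)         # expert's rough 67% interval (assumed)

prior_mean <- mode_guess          # for a normal prior, mode = mean
prior_sd   <- diff(interval) / 2  # interval length is about 2 standard deviations
c(prior_mean, prior_sd)           # normal prior: N(5, 1.5^2)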

Spike-and-Slab Priors for Linear Models

In regression, the priors on the regression coefficients are crucial. Whether or not β_j = 0 determines whether X_j is important in the regression. For any j, a useful prior for β_j is the spike-and-slab form:

[Figure: the prior density p(β_j) has a point mass ("spike") of probability h_0j at β_j = 0 and a flat "slab" of height h_1j over the interval [−f_j, f_j].]

Spike-and-Slab Priors for Linear Models

Here:

P(β_j = 0) = h_0j (= prior probability that X_j is not needed in the model)

P(β_j ≠ 0) = 1 − h_0j = h_1j (f_j − (−f_j)) = 2 f_j h_1j (where [−f_j, f_j] contains all reasonable values for β_j)

To include X_j in the model with certainty, set h_0j = 0. To reflect more doubt that X_j should be in the model, increase the ratio

h_0j / h_1j = h_0j / [(1 − h_0j) / (2 f_j)] = 2 f_j h_0j / (1 − h_0j)

Recently, nonparametric priors have become popular, typically involving a mixture of Dirichlet processes.
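A minimal sketch in R (the h_0j and f_j values are assumed): draw from this spike-and-slab prior by first flipping a coin with probability h_0j for the spike, then drawing from the uniform slab otherwise.

# Draw S values of beta_j from the spike-and-slab prior.
set.seed(1)
h0j <- 0.4      # P(beta_j = 0), prior probability that X_j is excluded (assumed)
fj  <- 10       # [-f_j, f_j] holds all reasonable values of beta_j (assumed)
S   <- 10000

spike  <- rbinom(S, 1, h0j) == 1              # TRUE with probability h0j
beta_j <- ifelse(spike, 0, runif(S, -fj, fj)) # point mass at 0, or flat slab

mean(beta_j == 0)   # should be close to h0j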

CHAPTER 6 SLIDES START HERE

The Monte Carlo Method

The Monte Carlo method involves studying a distribution (e.g., a posterior) and its characteristics by generating many random observations having that distribution. If θ^(1), ..., θ^(S) are iid draws from π(θ | x), then the empirical distribution of {θ^(1), ..., θ^(S)} approximates the posterior when S is large. By the law of large numbers, as S → ∞,

(1/S) Σ_{s=1}^S g(θ^(s)) → E[g(θ) | x]

The Monte Carlo Method

So as S → ∞:

θ̄ = (1/S) Σ_{s=1}^S θ^(s) → posterior mean

(1/(S−1)) Σ_{s=1}^S (θ^(s) − θ̄)² → posterior variance

#{θ^(s) ≤ c} / S → P[θ ≤ c | x]

median{θ^(1), ..., θ^(S)} → posterior median (and similarly for any posterior quantile)
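A minimal sketch of these approximations in R, using the gamma(219, 112) posterior from the example below as the distribution being sampled (the cutoff c = 2 is arbitrary):

set.seed(1)
S <- 100000
theta <- rgamma(S, shape = 219, rate = 112)  # S iid posterior draws

mean(theta)                               # approximates the posterior mean
var(theta)                                # approximates the posterior variance
mean(theta <= 2)                          # approximates P[theta <= 2 | x]
quantile(theta, c(0.025, 0.5, 0.975))     # posterior median and a 95% interval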

The Monte Carlo Method

If the posterior is a common distribution, as in many conjugate analyses, we can draw samples from the posterior using built-in R functions.

Example 1 (General Social Survey): Sample 1: number of children for women age 40+ with no bachelor's degree. Sample 2: number of children for women age 40+ with a bachelor's degree or higher. Assume Poisson(θ_1) and Poisson(θ_2) models for the data. We use gamma(2, 1) priors for θ_1 and θ_2.

The Monte Carlo Method

Data: n_1 = 111, Σ_i x_i1 = 217; n_2 = 44, Σ_i x_i2 = 66. The posterior for θ_1 is gamma(219, 112). The posterior for θ_2 is gamma(68, 45). Find P[θ_1 > θ_2 | x_1, x_2]. Find the posterior distribution of the ratio θ_1/θ_2. See the R example using the Monte Carlo method on the course web page.
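That course example is not reproduced here; the following is a minimal sketch of how such a computation could look in R, using the two posteriors above:

# Compare theta_1 and theta_2 by Monte Carlo, using draws from their
# gamma posteriors.
set.seed(1)
S <- 100000
theta1 <- rgamma(S, shape = 219, rate = 112)  # posterior draws for theta_1
theta2 <- rgamma(S, shape = 68,  rate = 45)   # posterior draws for theta_2

mean(theta1 > theta2)   # estimates P[theta_1 > theta_2 | x_1, x_2]

ratio <- theta1 / theta2                # draws from the posterior of theta_1/theta_2
quantile(ratio, c(0.025, 0.5, 0.975))   # summary of the ratio's posterior
hist(ratio, breaks = 50, main = "Posterior of theta_1 / theta_2")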