PS 203 Spring 2002 Homework One - Answer Key


1. If you have a home or office computer, download and install WinBUGS. If you don't have your own computer, try running WinBUGS in the Department lab.

2. The data set aes96media (on the PS203 website) contains survey data on voting in the 1996 Australian Federal Election. The variables are:

y, a dummy for whether survey respondent i voted for the Labor Party (1) or not (0)

PID, an indicator of partisanship (1 for Strong Labor, 2 for Weak Labor, 3 for Lean Labor, 4 for Independents, 5 for Lean Conservative, 6 for Weak Conservative, 7 for Strong Conservative)

media, a scale measure of media consumption through the election campaign (0 through 1 corresponding to low through high)

quiz, a scale measure of the respondent's level of political information, ascertained through a series of objective true/false items administered at the end of the survey (0 through 1 corresponding to low through high)

Use the available predictors to model the voting outcomes y, via logit. Use a series of dummy variables for each level of party identification (collapse the weak and strong conservative categories, since no strong conservatives voted Labor). Include an interaction between media and quiz.

(a) Briefly interpret the coefficients and report on the fit of the model to the data.

The party identification dummies perform as expected, with a steady, monotonically decreasing pattern from Strong Labor through to Weak/Strong Conservative. The media and quiz coefficients tap the effect of a unit change in one of these variables when the other is set to its lowest level. Thus, the coefficient on the media exposure variable suggests that when political information is at its lowest level (zero), increased media exposure decreases the probability of voting Labor. Similarly, the coefficient on quiz suggests that when media exposure is at its lowest level (zero), the probability of a Labor vote also decreases as political information increases.
The positive interaction term indicates that as both variables increase, these negative effects eventually turn into positive effects (see the next question and Figure 1).

(b) Use the estimated coefficients to solve for the level of political information z such that, conditional on z, media consumption has no impact on the probability of voting for the ALP.

The logit model can be written as

$$p_i = F(\mu_i), \qquad \mu_i = \alpha + x_{i1}\beta_1 + x_{i2}\beta_2 + x_{i1}x_{i2}\beta_3,$$

and we seek $z = x_{i2}$ such that $\partial p_i / \partial x_{i1} = 0$. Note that

$$\frac{\partial p_i}{\partial x_{i1}} = \frac{\partial F(\mu_i)}{\partial \mu_i}\,\frac{\partial \mu_i}{\partial x_{i1}} = f(\mu_i)\,(\beta_1 + x_{i2}\beta_3).$$

Since $f(\mu_i) > 0$ for all $\mu_i$,

$$\frac{\partial p_i}{\partial x_{i1}} = 0 \iff \beta_1 + x_{i2}\beta_3 = 0 \iff z = -\frac{\beta_1}{\beta_3},$$

where $\beta_1$ is the coefficient on media and $\beta_3$ is the coefficient on the interaction between media and quiz. This ratio is approximately .66. That is, for respondents with a quiz score of .66, there is no relationship between media exposure and the probability of voting Labor. Note that this is a very high level of quiz, corresponding to the 77th to 87th percentiles of this variable. Note also that for values of quiz greater than .66, the effect of media exposure is actually positive. See the contour plot in Figure 1.

(c) Use simulation methods to obtain a 95% confidence bound for z.

To do this, I simply sampled from the multivariate Normal distribution implied for $\beta$ by the MLEs. That is, from a Bayesian perspective, if we have flat priors over $\beta$, then the posterior for $\beta$ is proportional to the likelihood, and so

$$p(\beta \mid \text{data}) \approx N\big(\hat\beta_{MLE},\, V(\hat\beta_{MLE})\big).$$

Then we can induce a posterior on $g(\beta)$, denoted $p(g(\beta) \mid \text{data})$, by repeating the following steps many times ($t = 1, \ldots, T$):

i. sample $\beta^{(t)}$ from $p(\beta \mid \text{data})$
ii. form $g^{(t)} = g(\beta^{(t)})$.

In this case, $g(\beta) = -\beta_1/\beta_3$. Since this is a ratio of two (correlated) random variables, $p(g(\beta) \mid \text{data})$ has Cauchy-like properties with extremely heavy tails. In fact, the more Monte Carlo simulations we draw, the further we probe into the heavy tails. With 500,000 draws, the median of $p(g(\beta) \mid \text{data})$ is .66, equal to the value implied by the MLEs, with a 95 percent confidence interval extending outside the unit interval on which quiz is measured. A fifty percent bound (the inter-quartile range) is [.56, .87].

3. Consider the lung cancer data presented in class (from Johnson and Albert's Ordinal Data Modeling, p. 35). Eighty-six lung cancer patients and a matched sample of 86 controls were questioned about their smoking habits.
The two groups were chosen to represent random samples from a subpopulation of lung-cancer patients and an otherwise similar population of cancer-free individuals. The following table summarises the data:

              Cancer   Control
Smokers         83       72
Nonsmokers       3       14

Figure 1: Predicted Probabilities of Labor Vote, as a function of Quiz (political information) and Media (media exposure).
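
The simulation procedure in question 2(c), which locates the zero-effect level of quiz traced out in Figure 1, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficient values, variances and covariance below are hypothetical placeholders (chosen so that the ratio of point estimates is .66), not the fitted values from the homework, and the helper names `simulate_ratio` and `quantile` are invented here.

```python
import math
import random


def simulate_ratio(b1, b3, var1, var3, cov13, draws=100_000, seed=203):
    """Return sorted Monte Carlo draws of g(b) = -b1/b3.

    (b1, b3) is drawn from a bivariate normal centred on the supplied point
    estimates with covariance matrix [[var1, cov13], [cov13, var3]].
    """
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix.
    l11 = math.sqrt(var1)
    l21 = cov13 / l11
    l22 = math.sqrt(var3 - l21 ** 2)
    ratios = []
    for _ in range(draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draw_b1 = b1 + l11 * z1
        draw_b3 = b3 + l21 * z1 + l22 * z2
        ratios.append(-draw_b1 / draw_b3)
    ratios.sort()
    return ratios


def quantile(sorted_draws, q):
    """Empirical quantile of an already sorted list (nearest-rank style)."""
    return sorted_draws[int(q * (len(sorted_draws) - 1))]


# HYPOTHETICAL inputs, not the fitted values: chosen only so -b1/b3 = 0.66.
ratios = simulate_ratio(b1=-1.32, b3=2.0, var1=0.49, var3=1.0, cov13=-0.6)
median = quantile(ratios, 0.50)
iqr = (quantile(ratios, 0.25), quantile(ratios, 0.75))
ci95 = (quantile(ratios, 0.025), quantile(ratios, 0.975))
print(f"median={median:.2f}  IQR=[{iqr[0]:.2f}, {iqr[1]:.2f}]  "
      f"95%=[{ci95[0]:.2f}, {ci95[1]:.2f}]")
```

Summarising the draws with quantiles rather than a mean and standard deviation is deliberate: a ratio of normals has very heavy tails, so moment-based summaries are unstable while the median and inter-quartile range remain well behaved.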

Let $0 < p_L < 1$ and $0 < p_C < 1$ denote the population proportions of lung-cancer patients and controls who smoke, respectively. Assume a binomial model for the data and independence (both within and across groups).

(a) With uninformative (uniform) priors on $p_L$ and $p_C$, report the posterior means of these parameters, along with 95% credible intervals.

The uninformative (uniform) priors for $p_L$ and $p_C$ are equivalent to Beta(1,1) distributions, yielding posteriors with the following characteristics:

$$p(p_L \mid \text{data}) = \text{Beta}(83 + 1,\ 3 + 1), \qquad p(p_C \mid \text{data}) = \text{Beta}(72 + 1,\ 14 + 1)$$

Parameter   Mean    Mode (MLE)   2.5%   97.5%
p_L         84/88   83/86
p_C         73/88   72/86

(b) Consider the quantity $d = p_L - p_C$. With the same uninformative priors on $p_L$ and $p_C$, summarize the posterior density implied by the model for d.

d has a posterior mean of .125, with a 95 percent confidence interval extending from .04 to .22.

(c) Compare your Bayesian inferences about $p_L$, $p_C$ and d with those from a classical, likelihood analysis.

For $p_L$ and $p_C$, see the table above. Via independence across the two groups, the MLE of d is simply the difference of the two within-group point estimates, (83 - 72)/86 = 11/86, or about .128. Note that this point estimate corresponds to the posterior mode obtained with flat priors. To obtain a confidence interval for d, I rely on asymptotically valid normal approximations as summaries of uncertainty in the group-specific MLEs. This is convenient, since the normal is completely characterized by its mean and variance, and so a 95% confidence interval for the MLE of d can be obtained by simply adding/subtracting 1.96 standard errors to the MLE of d. By independence across groups, the variance of the MLE of d is

$$V(\hat d_{MLE}) = V(\hat p_L^{MLE} - \hat p_C^{MLE}) = V(\hat p_L^{MLE}) + V(\hat p_C^{MLE}) = \frac{p_L(1 - p_L)}{n} + \frac{p_C(1 - p_C)}{n}.$$

The half-width of a 95% bound for d is 1.96 times the square root of this variance; the resulting interval is [.041, .215], which is not dissimilar from that obtained via the Bayesian simulation procedure. That is, the asymptotic normal approximation is not bad with this sample size (n.b., n = 86).
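
The calculations in parts (a) through (c) can be reproduced with a short script. The sketch below is one possible way to do it, not necessarily how the original numbers were produced: it assumes the counts 83 of 86 and 72 of 86 from the table, uses plain Monte Carlo draws from the two Beta posteriors for d, and plugs the MLEs into the variance formula for the classical interval. The function names are invented for illustration.

```python
import math
import random

N = 86                      # respondents per group
SMOKERS_CANCER = 83         # smokers among lung-cancer patients
SMOKERS_CONTROL = 72        # smokers among controls


def beta_posterior(successes, n, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for a binomial proportion (Beta(1,1) = uniform prior)."""
    return prior_a + successes, prior_b + (n - successes)


def quantile(sorted_draws, q):
    """Empirical quantile of an already sorted list."""
    return sorted_draws[int(q * (len(sorted_draws) - 1))]


# (a) Posterior means under uniform priors.
a_l, b_l = beta_posterior(SMOKERS_CANCER, N)     # Beta(84, 4)
a_c, b_c = beta_posterior(SMOKERS_CONTROL, N)    # Beta(73, 15)
mean_l = a_l / (a_l + b_l)
mean_c = a_c / (a_c + b_c)

# (b) Posterior of d = p_L - p_C by simulation from the two independent posteriors.
rng = random.Random(203)
d_draws = sorted(
    rng.betavariate(a_l, b_l) - rng.betavariate(a_c, b_c) for _ in range(100_000)
)
d_mean = sum(d_draws) / len(d_draws)
d_lo, d_hi = quantile(d_draws, 0.025), quantile(d_draws, 0.975)

# (c) Classical point estimate and normal-approximation interval.
p_l_hat, p_c_hat = SMOKERS_CANCER / N, SMOKERS_CONTROL / N
d_hat = p_l_hat - p_c_hat
se = math.sqrt(p_l_hat * (1 - p_l_hat) / N + p_c_hat * (1 - p_c_hat) / N)
ci = (d_hat - 1.96 * se, d_hat + 1.96 * se)

print(f"posterior means: p_L={mean_l:.3f}, p_C={mean_c:.3f}, d={d_mean:.3f}")
print(f"95% posterior interval for d: [{d_lo:.2f}, {d_hi:.2f}]")
print(f"MLE of d={d_hat:.3f}, 95% CI=[{ci[0]:.3f}, {ci[1]:.3f}]")
```

The same draws could also be used to fill in per-parameter credible intervals by taking quantiles of the individual Beta samples instead of their difference.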

(d) Prior Sensitivity Analysis: Imagine that after seeing the data in the table (above), a skeptic maintains that he is still not convinced that $p_L > p_C$. Assume this skeptic has an uninformative prior on $p_C$. Find a prior on $p_L$ that rationalizes the skeptic's posterior beliefs.

There is an infinite set of Beta priors for $p_L$ that will rationalize the skeptic's beliefs. To provide a sense of the mapping from prior to posterior, I use a data-equivalent representation of the family of priors, summarizing a Beta($\alpha$, $\beta$) density with its mean and an equivalent sample size. I summarize the mapping from the skeptic's priors (a two-space, since the Beta density takes two parameters) into the posterior probability that $p_L > p_C$.

4. In generating state-level forecasts of the 2000 presidential vote, Jackman and Rivers used historical election results as priors. For instance, the average of Democratic presidential vote share in 1988, 1992 and 1996 was used to generate a prior for forecasting the 2000 outcomes. For California, this averaging of historical results yields a prior mean for Democratic vote share of 48.4%. Jackman and Rivers complete the specification of their prior by assuming that, after controlling for period-specific national-level shocks, vote shares vary randomly around a stable long-term average level specific to each state. They estimated this within-state random component to have a standard deviation of 3.1 percentage points. This prior information is to be combined with poll numbers from the 2000 election season to generate state-level forecasts. For instance, a Zogby poll of 436 Californian likely voters fielded on August 23, 2000 found 42% support for Gore. Use Bayesian methods to combine the historical prior information with the poll information to come up with a posterior density over Gore support in California. Report the posterior mean and a 95% confidence interval.
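
A minimal numerical sketch of the pooling asked for in this question, covering both routes named in the hints (a normal approximation and a conjugate Beta prior). The inputs are the figures stated in the question (prior mean .484, prior standard deviation .031, poll share .42, n = 436). Two things are assumptions made here for illustration: the function names, and treating the number of poll "successes" as the non-integer product .42 × 436.

```python
import math

PRIOR_MEAN, PRIOR_SD = 0.484, 0.031   # historical prior for California
POLL_SHARE, POLL_N = 0.42, 436        # Zogby poll, August 23, 2000


def pool_normal(mean0, var0, mean1, var1):
    """Precision-weighted combination of two normal sources of information."""
    precision = 1.0 / var0 + 1.0 / var1
    mean = (mean0 / var0 + mean1 / var1) / precision
    return mean, 1.0 / precision


def beta_from_moments(mean, var):
    """Solve for Beta(alpha, beta) with the given mean and variance."""
    total = mean * (1.0 - mean) / var - 1.0      # alpha + beta
    return mean * total, (1.0 - mean) * total


# Route 1: express the poll as a normal with variance p(1 - p)/n, then pool.
prior_var = PRIOR_SD ** 2
poll_var = POLL_SHARE * (1.0 - POLL_SHARE) / POLL_N
norm_mean, norm_var = pool_normal(PRIOR_MEAN, prior_var, POLL_SHARE, poll_var)
norm_sd = math.sqrt(norm_var)
norm_ci = (norm_mean - 1.96 * norm_sd, norm_mean + 1.96 * norm_sd)

# Route 2: express the prior as a Beta density, then update with binomial data.
alpha0, beta0 = beta_from_moments(PRIOR_MEAN, prior_var)
y = POLL_SHARE * POLL_N               # assumption: fractional "successes" allowed
post_a, post_b = alpha0 + y, beta0 + (POLL_N - y)
beta_mean = post_a / (post_a + post_b)
beta_sd = math.sqrt(
    post_a * post_b / ((post_a + post_b) ** 2 * (post_a + post_b + 1.0))
)

print(f"normal route: mean={norm_mean:.3f}, sd={norm_sd:.4f}, "
      f"95%=[{norm_ci[0]:.3f}, {norm_ci[1]:.3f}]")
print(f"beta route:   alpha0={alpha0:.1f}, beta0={beta0:.1f}, "
      f"mean={beta_mean:.3f}, sd={beta_sd:.4f}")
```

Both routes weight each source by its precision, which is why the poll (the more precise source here) pulls the posterior mean most of the way from .484 toward .42.
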
Hints: you will have to first convert the prior information into a form suitable for pooling with the poll data (or vice-versa). For instance, if you assume a binomial model for the poll data, then you will have to convert the historical prior information into a conjugate Beta density. On the other hand, you might assume that a normal model and prior is a suitable characterization of the information in the poll and the historical data, in which case you will need to convert the poll information into a form captured by the parameters of a normal distribution.

First, try converting the poll information into a form suitable for pooling (via Bayes Rule) with the historical information. The historical information is expressed as a mean and a standard deviation, which, for convenience, we can interpret as the sufficient statistics of a normal distribution. The poll information can also be expressed in terms of the sufficient statistics of a normal distribution, i.e., mean = .42 and variance

$$\text{var}(p) = \frac{p(1 - p)}{n} = \frac{.42 \times .58}{436} \approx .000559,$$

while the historical analysis yields a variance of $.031^2 = .000961$. Pooling this information via Bayes Rule yields a variance of

$$v = \left(\frac{1}{.000559} + \frac{1}{.000961}\right)^{-1} \approx \frac{1}{2830} \approx .000353.$$

Figure 2: Mapping from Prior over $p_L$ to Posterior Mean of d (axes: Prior Pr(Smoker | Lung Cancer) and Prior Precision as Equivalent N). The contour lines connect points in the prior space for $p_L$ (defined as a prior mean and an equivalent prior n) that give rise to the same posterior mean for d. For instance, an uninformative prior (prior mean = .5 and prior sample size of zero) yields a posterior mean for d of just over .1. The observed data for lung cancer patients (solid square) and the control group (open circle) are also represented in this prior space for comparison.
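
The mapping summarized in Figure 2 (question 3(d)) can be explored numerically. The sketch below is an illustration, not the code behind the figure: it assumes the prior on $p_L$ is a Beta density parameterized by a mean and an equivalent sample size, keeps the uniform prior on $p_C$, and uses the counts 83 of 86 and 72 of 86. The function names and the example skeptical prior (mean .5, equivalent n of 100) are hypothetical.

```python
import random

N = 86
SMOKERS_CANCER, SMOKERS_CONTROL = 83, 72


def posterior_for_p_l(prior_mean, prior_n):
    """Beta posterior for p_L given a Beta prior stated as (mean, equivalent n)."""
    alpha0 = prior_mean * prior_n
    beta0 = (1.0 - prior_mean) * prior_n
    return alpha0 + SMOKERS_CANCER, beta0 + (N - SMOKERS_CANCER)


def posterior_for_p_c():
    """Beta posterior for p_C under the uniform Beta(1, 1) prior."""
    return 1.0 + SMOKERS_CONTROL, 1.0 + (N - SMOKERS_CONTROL)


def posterior_mean_d(prior_mean, prior_n):
    """Posterior mean of d = p_L - p_C (difference of the two Beta means)."""
    a_l, b_l = posterior_for_p_l(prior_mean, prior_n)
    a_c, b_c = posterior_for_p_c()
    return a_l / (a_l + b_l) - a_c / (a_c + b_c)


def prob_p_l_exceeds_p_c(prior_mean, prior_n, draws=50_000, seed=2002):
    """Monte Carlo estimate of the posterior probability that p_L > p_C."""
    rng = random.Random(seed)
    a_l, b_l = posterior_for_p_l(prior_mean, prior_n)
    a_c, b_c = posterior_for_p_c()
    hits = sum(
        rng.betavariate(a_l, b_l) > rng.betavariate(a_c, b_c) for _ in range(draws)
    )
    return hits / draws


flat_mean_d = posterior_mean_d(0.5, 0)           # prior sample size of zero
flat_prob = prob_p_l_exceeds_p_c(0.5, 0)
skeptic_mean_d = posterior_mean_d(0.5, 100)      # hypothetical skeptical prior
skeptic_prob = prob_p_l_exceeds_p_c(0.5, 100)
print(f"flat prior:    E[d]={flat_mean_d:.3f}, Pr(p_L > p_C)={flat_prob:.3f}")
print(f"skeptic prior: E[d]={skeptic_mean_d:.3f}, Pr(p_L > p_C)={skeptic_prob:.3f}")
```

Evaluating these two functions over a grid of prior means and equivalent sample sizes would give the kind of contour surface the figure describes.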

This posterior variance corresponds to a standard deviation of 1.88 percentage points. The pooled (or posterior) mean is

$$\bar{x} = v\left(\frac{.42}{.000559} + \frac{.484}{.000961}\right) \approx \frac{1255}{2830} \approx .444.$$

Another approach is to turn the historical information into a form suitable for pooling with the poll information. This can be done by treating the historical information as the equivalent of a Beta prior for the binomial poll data. The historical information has mean .484 and variance .000961, which we can use to solve for the parameters of a Beta($\alpha$, $\beta$) distribution, noting that

$$\frac{\alpha}{\alpha + \beta} = .484, \qquad \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} = .000961.$$

Solving for $\alpha$ and $\beta$ yields $\alpha \approx 125.3$ and $\beta \approx 133.6$. The binomial data from the poll can be represented as $y = .42 \times 436 \approx 183$ successes from n = 436 trials, and so the posterior is a Beta density with parameters $\alpha + y \approx 308.3$ and $\beta + n - y \approx 386.6$. This Beta distribution has mean .444 and variance of about .000355, or standard deviation about 1.88 percentage points. Note that the two approaches to this problem yield identical answers.

5. Given data $y = (y_1, y_2, \ldots, y_n)$, consider the model $y_i \overset{iid}{\sim} N(\theta_1 + \theta_2, 1)$, $i = 1, \ldots, n$. Prove that

(a) $\theta_1$ and $\theta_2$ are unidentified.

The likelihood for these iid normal data is

$$p(y; \theta_1, \theta_2, \sigma^2 = 1) = \prod_{i=1}^{n} p(y_i; \theta_1, \theta_2, \sigma^2 = 1) = \prod_{i=1}^{n} \phi(y_i; \theta_1, \theta_2, \sigma^2 = 1) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} \exp\left[\frac{-(y_i - \theta_1 - \theta_2)^2}{2}\right] = (2\pi)^{-n/2} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} (y_i - \theta_1 - \theta_2)^2\right]$$

and so

$$\ln p(y) \propto -\frac{1}{2}\sum_{i=1}^{n} (y_i - \theta_1 - \theta_2)^2 = -\frac{1}{2}\left(\sum y_i^2 + n\theta_1^2 + n\theta_2^2 - 2\theta_1 \sum y_i - 2\theta_2 \sum y_i + 2n\theta_1\theta_2\right).$$

Now we have the following derivatives:

$$\frac{\partial \ln p(y)}{\partial \theta_1} = -n\theta_1 + \sum y_i - n\theta_2, \qquad \frac{\partial \ln p(y)}{\partial \theta_2} = -n\theta_1 + \sum y_i - n\theta_2,$$

$$\frac{\partial^2 \ln p(y)}{\partial \theta_1^2} = -n, \qquad \frac{\partial^2 \ln p(y)}{\partial \theta_2^2} = -n, \qquad \frac{\partial^2 \ln p(y)}{\partial \theta_1 \partial \theta_2} = \frac{\partial^2 \ln p(y)}{\partial \theta_2 \partial \theta_1} = -n,$$

and so the Hessian (the matrix of second derivatives) of the log-likelihood is

$$H = \begin{bmatrix} -n & -n \\ -n & -n \end{bmatrix} = -n \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},$$

which is clearly not of full column rank (column one is a linear combination of column two, and vice-versa), and hence singular. This implies that the likelihood function does not have a unique maximum with respect to $\theta = (\theta_1, \theta_2)$ and so the parameters are not identified.

(b) $\theta_1 + \theta_2$ is identified.

This is rather trivial and I will not elaborate here. The model for the mean is now re-parameterized as $\mu = \theta_1 + \theta_2$. Twice differentiate the log-likelihood function with respect to $\mu$; the second derivative is $-n$, implying that the likelihood over $\mu$ has a unique maximum.

(c) normal priors with finite variances on $\theta_1$ and $\theta_2$ are sufficient to identify $\theta_1$ and $\theta_2$.

Harder problem. The strategy of proof is to note that since a posterior is proportional to a prior times a likelihood, the log-posterior equals (up to an additive constant) the log-prior plus the log-likelihood, and further, the Hessian of the log-posterior equals the Hessian of the log-prior plus the Hessian of the log-likelihood. We have shown that the Hessian of the log-likelihood is not of full rank. It remains to be shown that with proper priors, this is no longer the case and, further, that the Hessian of the log-posterior is now negative definite, implying a unique posterior mode for $\theta = (\theta_1, \theta_2)$.
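
The remaining step can be sketched as follows. This is one way to finish the argument, under an assumption not stated in the text: the two priors are taken to be independent, with $\theta_j \sim N(m_j, s_j^2)$ and $0 < s_j^2 < \infty$ for $j = 1, 2$. The symbols $m_j$ and $s_j^2$ are introduced here purely for illustration.

```latex
\begin{align*}
\ln p(\theta_1, \theta_2 \mid y)
  &= \text{const} - \frac{1}{2}\sum_{i=1}^{n} (y_i - \theta_1 - \theta_2)^2
     - \frac{(\theta_1 - m_1)^2}{2 s_1^2} - \frac{(\theta_2 - m_2)^2}{2 s_2^2}, \\[4pt]
H_{\text{post}}
  &= \begin{bmatrix} -n & -n \\ -n & -n \end{bmatrix}
   + \begin{bmatrix} -1/s_1^2 & 0 \\ 0 & -1/s_2^2 \end{bmatrix}
   = \begin{bmatrix} -n - 1/s_1^2 & -n \\ -n & -n - 1/s_2^2 \end{bmatrix}, \\[4pt]
\operatorname{tr}(H_{\text{post}})
  &= -2n - \frac{1}{s_1^2} - \frac{1}{s_2^2} < 0, \\[4pt]
\det(H_{\text{post}})
  &= \left(n + \frac{1}{s_1^2}\right)\left(n + \frac{1}{s_2^2}\right) - n^2
   = n\left(\frac{1}{s_1^2} + \frac{1}{s_2^2}\right) + \frac{1}{s_1^2 s_2^2} > 0.
\end{align*}
```

A symmetric 2-by-2 matrix with negative trace and positive determinant has two negative eigenvalues, so the Hessian of the log-posterior is negative definite everywhere. The log-posterior is therefore strictly concave and has a unique mode. As both prior variances grow without bound the determinant tends to zero, which recovers the singular Hessian of part (a).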


Bayesian Inference for Binomial Proportion 8 Bayesian Inference for Binomial Proportion Frequently there is a large population where π, a proportion of the population, has some attribute. For instance, the population could be registered voters

More information

1 Inference for binomial proportion (Matlab/Python)

1 Inference for binomial proportion (Matlab/Python) Bayesian data analysis exercises from 2015 1 Inference for binomial proportion (Matlab/Python) Algae status is monitored in 274 sites at Finnish lakes and rivers. The observations for the 2008 algae status

More information

Eco517 Fall 2014 C. Sims FINAL EXAM

Eco517 Fall 2014 C. Sims FINAL EXAM Eco517 Fall 2014 C. Sims FINAL EXAM This is a three hour exam. You may refer to books, notes, or computer equipment during the exam. You may not communicate, either electronically or in any other way,

More information

A Very Brief Summary of Bayesian Inference, and Examples

A Very Brief Summary of Bayesian Inference, and Examples A Very Brief Summary of Bayesian Inference, and Examples Trinity Term 009 Prof Gesine Reinert Our starting point are data x = x 1, x,, x n, which we view as realisations of random variables X 1, X,, X

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

Classification. Classification is similar to regression in that the goal is to use covariates to predict on outcome.

Classification. Classification is similar to regression in that the goal is to use covariates to predict on outcome. Classification Classification is similar to regression in that the goal is to use covariates to predict on outcome. We still have a vector of covariates X. However, the response is binary (or a few classes),

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Week 5: Logistic Regression & Neural Networks

Week 5: Logistic Regression & Neural Networks Week 5: Logistic Regression & Neural Networks Instructor: Sergey Levine 1 Summary: Logistic Regression In the previous lecture, we covered logistic regression. To recap, logistic regression models and

More information

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013 Bayesian Methods Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2013 1 What about prior n Billionaire says: Wait, I know that the thumbtack is close to 50-50. What can you

More information

One-parameter models

One-parameter models One-parameter models Patrick Breheny January 22 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/17 Introduction Binomial data is not the only example in which Bayesian solutions can be worked

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS 1a. Under the null hypothesis X has the binomial (100,.5) distribution with E(X) = 50 and SE(X) = 5. So P ( X 50 > 10) is (approximately) two tails

More information

Bayesian Methods in Multilevel Regression

Bayesian Methods in Multilevel Regression Bayesian Methods in Multilevel Regression Joop Hox MuLOG, 15 september 2000 mcmc What is Statistics?! Statistics is about uncertainty To err is human, to forgive divine, but to include errors in your design

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: October

More information

Density Estimation: ML, MAP, Bayesian estimation

Density Estimation: ML, MAP, Bayesian estimation Density Estimation: ML, MAP, Bayesian estimation CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Maximum-Likelihood Estimation Maximum

More information

Intro to Probability. Andrei Barbu

Intro to Probability. Andrei Barbu Intro to Probability Andrei Barbu Some problems Some problems A means to capture uncertainty Some problems A means to capture uncertainty You have data from two sources, are they different? Some problems

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Hypothesis testing for µ:

Hypothesis testing for µ: University of California, Los Angeles Department of Statistics Statistics 10 Elements of a hypothesis test: Hypothesis testing Instructor: Nicolas Christou 1. Null hypothesis, H 0 (always =). 2. Alternative

More information

A Bayesian perspective on GMM and IV

A Bayesian perspective on GMM and IV A Bayesian perspective on GMM and IV Christopher A. Sims Princeton University November 26, 2013 What is a Bayesian perspective? A Bayesian perspective on scientific reporting views all

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Lecture : Probabilistic Machine Learning

Lecture : Probabilistic Machine Learning Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning

More information

Beta statistics. Keywords. Bayes theorem. Bayes rule

Beta statistics. Keywords. Bayes theorem. Bayes rule Keywords Beta statistics Tommy Norberg Mathematical Sciences Chalmers University of Technology Gothenburg, SWEDEN Bayes s formula Prior density Likelihood Posterior density Conjugate

More information

Bayesian Estimation An Informal Introduction

Bayesian Estimation An Informal Introduction Mary Parker, Bayesian Estimation An Informal Introduction page 1 of 8 Bayesian Estimation An Informal Introduction Example: I take a coin out of my pocket and I want to estimate the probability of heads

More information

Eco517 Fall 2014 C. Sims MIDTERM EXAM

Eco517 Fall 2014 C. Sims MIDTERM EXAM Eco57 Fall 204 C. Sims MIDTERM EXAM You have 90 minutes for this exam and there are a total of 90 points. The points for each question are listed at the beginning of the question. Answer all questions.

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2 Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate

More information

Introduction to Bayesian Methods. Introduction to Bayesian Methods p.1/??

Introduction to Bayesian Methods. Introduction to Bayesian Methods p.1/?? to Bayesian Methods Introduction to Bayesian Methods p.1/?? We develop the Bayesian paradigm for parametric inference. To this end, suppose we conduct (or wish to design) a study, in which the parameter

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

Lecture 01: Introduction

Lecture 01: Introduction Lecture 01: Introduction Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 01: Introduction

More information

Announcements. Lecture 5: Probability. Dangling threads from last week: Mean vs. median. Dangling threads from last week: Sampling bias

Announcements. Lecture 5: Probability. Dangling threads from last week: Mean vs. median. Dangling threads from last week: Sampling bias Recap Announcements Lecture 5: Statistics 101 Mine Çetinkaya-Rundel September 13, 2011 HW1 due TA hours Thursday - Sunday 4pm - 9pm at Old Chem 211A If you added the class last week please make sure to

More information

Models of Reputation with Bayesian Updating

Models of Reputation with Bayesian Updating Models of Reputation with Bayesian Updating Jia Chen 1 The Tariff Game (Downs and Rocke 1996) 1.1 Basic Setting Two states, A and B, are setting the tariffs for trade. The basic setting of the game resembles

More information

Bayesian Inference. p(y)

Bayesian Inference. p(y) Bayesian Inference There are different ways to interpret a probability statement in a real world setting. Frequentist interpretations of probability apply to situations that can be repeated many times,

More information

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables?

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables? Linear Regression Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2014 1 What about continuous variables? n Billionaire says: If I am measuring a continuous variable, what

More information

A Discussion of the Bayesian Approach

A Discussion of the Bayesian Approach A Discussion of the Bayesian Approach Reference: Chapter 10 of Theoretical Statistics, Cox and Hinkley, 1974 and Sujit Ghosh s lecture notes David Madigan Statistics The subject of statistics concerns

More information

Monday, September 10 Handout: Random Processes, Probability, Random Variables, and Probability Distributions

Monday, September 10 Handout: Random Processes, Probability, Random Variables, and Probability Distributions Amherst College Department of Economics Economics 360 Fall 202 Monday, September 0 Handout: Random Processes, Probability, Random Variables, and Probability Distributions Preview Random Processes and Probability

More information

Introduction to Bayesian Statistics. James Swain University of Alabama in Huntsville ISEEM Department

Introduction to Bayesian Statistics. James Swain University of Alabama in Huntsville ISEEM Department Introduction to Bayesian Statistics James Swain University of Alabama in Huntsville ISEEM Department Author Introduction James J. Swain is Professor of Industrial and Systems Engineering Management at

More information

Inference when identifying assumptions are doubted. A. Theory B. Applications

Inference when identifying assumptions are doubted. A. Theory B. Applications Inference when identifying assumptions are doubted A. Theory B. Applications 1 A. Theory Structural model of interest: A y t B 1 y t1 B m y tm u t nn n1 u t i.i.d. N0, D D diagonal 2 Bayesian approach:

More information

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices.

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only). 2. What is the

More information

Bayesian Methods. David S. Rosenberg. New York University. March 20, 2018

Bayesian Methods. David S. Rosenberg. New York University. March 20, 2018 Bayesian Methods David S. Rosenberg New York University March 20, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 March 20, 2018 1 / 38 Contents 1 Classical Statistics 2 Bayesian

More information