Basic of Probability Theory for Ph.D. students in Education, Social Sciences and Business (Shing On LEUNG and Hui Ping WU) (May 2015)


This is a series of 3 talks, respectively on:
A. Probability Theory
B. Hypothesis Testing
C. Bayesian Inference

Lecture 3: Bayesian Inference

(Most statistical details can be found via web search. These lectures emphasize conceptual understanding rather than technical details.)

C. Bayesian approach

Bayes Theorem

Here Pr(A/B) denotes the conditional probability of A given B.

Pr(A/B) = Pr(A & B) / Pr(B) = Pr(B/A) * Pr(A) / Pr(B)
Pr(B) = Pr(B/A) * Pr(A) + Pr(B/not A) * Pr(not A)

For convenience, let A = θ (model) and B = X (observations). So,

Pr(θ/X) = Pr(θ and X) / Pr(X) = Pr(X/θ) * Pr(θ) / Pr(X)

The roles of X and θ are interchanged

Classical, Pr(X/θ): the sampling distribution of the data. Frequentists implicitly assume many realizations of the data (e.g. each day it can rain or not rain), but in reality there is only one (each day only one event happens: rain or no rain) (Yuen 2011).

Bayesian, Pr(θ/X): the uncertainty over the parameter space. In Bayesian terms, rain yesterday is X, the data (it happened, only once); tomorrow is y (predictive), which is governed by the parameter θ.

An easy example:

M1 = 1 black and 9 white balls, θ = θ1 = 0.1
M2 = 9 black and 1 white ball, θ = θ2 = 0.9

Procedure: select a bag at random (Pr(θ1) = Pr(θ2) = 0.5), draw 1 ball, and guess which bag (M1 or M2) it came from.

X = B if the selected ball is black; X = W if it is white.

Pr(X=B) = Pr(X=B/θ1) * Pr(θ1) + Pr(X=B/θ2) * Pr(θ2) = 0.1 * 0.5 + 0.9 * 0.5 = 0.5
Pr(θ1/X=B) = Pr(X=B/θ1) * Pr(θ1) / Pr(X=B) = 0.05 / 0.5 = 0.1
Pr(θ2/X=B) = Pr(X=B/θ2) * Pr(θ2) / Pr(X=B) = 0.45 / 0.5 = 0.9

Pr(θ1/B) vs Pr(θ2/B) = 0.1 vs 0.9
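The arithmetic of the two-bag example can be checked with a minimal Python sketch. The function name `posterior` and the dictionary layout are illustrative choices, not part of the lecture.

```python
def posterior(prior, likelihood):
    """Return Pr(model/X) for each model using Bayes theorem."""
    # Joint probability Pr(X and model) = Pr(X/model) * Pr(model)
    joint = {m: likelihood[m] * prior[m] for m in prior}
    # Pr(X): sum of the joint probabilities over all models
    evidence = sum(joint.values())
    return {m: joint[m] / evidence for m in joint}

prior = {"M1": 0.5, "M2": 0.5}      # a bag is selected at random
pr_black = {"M1": 0.1, "M2": 0.9}   # Pr(X=B/theta1), Pr(X=B/theta2)

post_black = posterior(prior, pr_black)
print(post_black)
```

Passing the probabilities of a white ball (0.9 and 0.1) instead gives the posterior after observing X = W.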

Interpretation

If the ball is black, we guess that it comes from M2; otherwise, from M1.
We make inferences about the model (M) based on our observations (X).
X can be medical symptoms (data), M can be the disease.
X can be students' achievements, M can be SES, effort, esteem, attitude, etc.

* Important: Classical Pr(X/θ) vs Bayesian Pr(θ/X)

Prior and Posterior (in continuous form)

p(θ/x) = p(x/θ) * p(θ) / p(x), and p(x) = ∫ p(x/θ) * p(θ) dθ

Pr(M) or p(θ) is the prior.
Pr(M/X) or p(θ/x) is the posterior.

Simple example with Hypothesis Testing (Classical Approach)

H0: The ball is from M1
H1: The ball is from M2

Decision rule: if the ball is black (X = B), we reject H0 and say the ball is from M2.

Under H0, Pr(X=B/M1) = 0.1 > 0.05, so at the 5% level we do not reject H0 (unless M1 had 5 black out of 100 balls). With p = 0.1, we can use the term "marginally significant". Still, we are protecting H0.

Under H1, Pr(X=B/M2) = 0.9 (the power of the test is good) and Pr(X=W/M2) = 0.1 (this is the type 2 error).

But the test is set up without knowing what happens in M2, as we concentrate on M1. Hence, in general, power can be lower and the type 2 error can be large.

The classical approach makes no statement on Pr(θ1/B) vs Pr(θ2/B) = 0.1 vs 0.9.
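The classical quantities on this slide can be written out as a short Python sketch. The variable names are illustrative, and the 5% level is the conventional threshold mentioned on the slide.

```python
pr_black = {"M1": 0.1, "M2": 0.9}   # Pr(X=B/M1), Pr(X=B/M2)

# Rule under study: reject H0 (ball from M1) whenever the ball is black.
type1_error = pr_black["M1"]        # Pr(reject H0 / H0 true)
power = pr_black["M2"]              # Pr(reject H0 / H1 true)
type2_error = 1 - power             # Pr(keep H0 / H1 true)

# A black ball has probability 0.1 under H0, which is compared with alpha.
alpha = 0.05
p_value_black = pr_black["M1"]
reject_h0 = p_value_black <= alpha

print(type1_error, power, round(type2_error, 3), reject_h0)
```

None of these numbers is Pr(θ1/B) or Pr(θ2/B); they all condition on a model, not on the data.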

Credible Interval vs Confidence Interval

A credible interval (CreI) is an interval in Pr(θ/X) (the posterior) that specifies the most probable range (say 95%) of a parameter. In a multi-dimensional parameter space, it is the credible region.

A 95% confidence interval (ConI): if we had many samples, 95% of the ConIs would contain the true value (in practice we get only one sample). Here, parameters are fixed and intervals are random.

CreI is not equal to ConI because of (i) the existence of a prior, and (ii) the treatment of nuisance parameters.

For (i), if priors are "reasonably unbiased", the differences are minor.
For (ii), ConI (or the classical approach) takes the mle values of nuisance parameters. The Bayesian approach has to integrate over them (and that is the most difficult task).
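The "many samples" reading of a confidence interval can be illustrated by simulation. This is a minimal sketch under assumed settings (Normal data with known σ, sample size 20, 2000 repetitions, a fixed seed); none of these values come from the lecture.

```python
import random
from math import sqrt
from statistics import mean

def coverage(true_mu=0.0, sigma=1.0, n=20, reps=2000, seed=1):
    """Fraction of 95% intervals (known sigma) that contain the fixed true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        # Each repetition is a new sample, so the interval moves; true_mu does not.
        xbar = mean(rng.gauss(true_mu, sigma) for _ in range(n))
        if xbar - half_width <= true_mu <= xbar + half_width:
            hits += 1
    return hits / reps

cov = coverage()
print(cov)
```

The printed fraction should be close to 0.95, which is a statement about the procedure over repeated samples rather than about any single interval.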

Simple Example: t-test

Golf scores for males and females are:
Male: 82, 80, 85, 85, 78, 87, 82
Female: 75, 76, 80, 77, 80, 77, 73

Sample difference: y = x̄1 - x̄2 = 5.85
n = 7 for both sexes (equal group sizes are not necessary)

t-test from SPSS, path: Analyze > Compare Means > Independent-Samples T Test

t = 3.83, df = 12, p-value = 0.002, 95% CI (2.52, 9.19)
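The SPSS figures can be cross-checked with the Python standard library. This is a sketch of the equal-variance (pooled) two-sample t-test; the function name is illustrative, and the critical value 2.179 is the tabulated 97.5% point of the t distribution with 12 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

male = [82, 80, 85, 85, 78, 87, 82]
female = [75, 76, 80, 77, 80, 77, 73]

def pooled_t_test(x1, x2):
    """Return (difference in means, standard error, t statistic, df)."""
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    diff = mean(x1) - mean(x2)
    # Pooled sample variance, assuming equal population variances
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, se, diff / se, df

diff, se, t_stat, df = pooled_t_test(male, female)

T_CRIT_975_DF12 = 2.179   # tabulated value, valid for df = 12 only
ci = (diff - T_CRIT_975_DF12 * se, diff + T_CRIT_975_DF12 * se)

print(round(diff, 2), round(t_stat, 2), df, tuple(round(v, 2) for v in ci))
```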

Bayesian Inference of this example

x1 ~ N(µ1, σ1²) and x2 ~ N(µ2, σ2²), then x̄1 ~ N(µ1, σ1²/n1) and x̄2 ~ N(µ2, σ2²/n2).

Assume σ1 and σ2 are known (but see later).

Let θ = µ1 - µ2 be the parameter for the difference, and let y = x̄1 - x̄2 be the random variable.

y/θ ~ N(θ, σ1²/n1 + σ2²/n2)

p(θ/y) = p(y/θ) * p(θ) / p(y); or p(θ/y) ∝ p(y/θ) * p(θ)

Assume: prior of θ ~ N(0, σ3²); the mean 0 implies no bias; σ3 is known.

If everything is Normal, the posterior p(θ/y) is Normal, and θ/y ~ N(µ, σ²), where µ and σ are the posterior mean and standard deviation.

The credible interval is CreI (U1, U2) = (µ - 2σ, µ + 2σ).
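The expressions for µ and σ did not survive in this transcription. The sketch below uses the standard conjugate Normal-Normal result for a prior mean of 0, which may differ from the form shown on the original slide. The value 2.34 for σ1²/n1 + σ2²/n2 is a plug-in from the sample variances, and the σ3 values are illustrative assumptions, so the output is not meant to reproduce the lecture's own table.

```python
from math import sqrt

def normal_posterior(y, var_y, sigma3):
    """Posterior mean and sd of theta for y/theta ~ N(theta, var_y), theta ~ N(0, sigma3^2)."""
    weight = sigma3**2 / (sigma3**2 + var_y)   # weight given to the observation
    post_mean = weight * y                     # shrinks y towards the prior mean 0
    post_sd = sqrt(weight * var_y)
    return post_mean, post_sd

y = 5.857       # observed difference in means
var_y = 2.34    # assumed known value of sigma1^2/n1 + sigma2^2/n2

for sigma3 in (1.0, 2.0, 5.0, 100.0):          # illustrative prior standard deviations
    mu, sd = normal_posterior(y, var_y, sigma3)
    print(sigma3, round(mu, 2), (round(mu - 2 * sd, 2), round(mu + 2 * sd, 2)))
```

A small σ3 pulls the interval towards 0, while a very large σ3 leaves it centred near the observed difference.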

But "everything is Normal" is not always the case.

If σ is unknown, p(θ/y) is not Normal even if everything else is Normal.

In general, the posterior p(θ/y) can be very complicated. This is one of the difficulties in using the Bayesian approach.

Prior and Credible Interval

σ3      µ       CreI (U1, U2)
                (0.38, 3.13)
                (0.80, 6.58)
                (1.00, 8.28)
                (1.24, 10.20)
                (1.26, 10.43)
CI      5.85    (2.52, 9.19)

Interpretation:

The final result is a mixture of (i) the prior and (ii) the observations.

The prior mean is 0. When the prior standard deviation, σ3, is small, the effect of the prior is stronger and the final result will be closer to 0.

If σ3 is known and is very large, we have no strong prior belief (an unbiased or non-informative prior), and the final result will depend largely on the observations.

Generally, a non-informative (or large-variance) prior makes Bayesian results similar to classical ones.

Another factor: σ3 is assumed to be known, which is unrealistic. Such a parameter is called a nuisance parameter (or hyper-parameter).

Everyone uses computer packages; later, we introduce a web-based Bayesian t-test calculator. There are many others.

Nuisance parameters (Nuisance but important!)

Classically, we take the nuisance parameter, σ3, at its mle.

Bayesian evidence: p(θ/x) = p(x/θ) * p(θ) / p(x), or Posterior = Evidence * Prior / p(x)

p(x/θ) = ∫ p(x, σ3/θ) dσ3, called the evidence

To compute the evidence, we need to integrate (∫) over all possible values of σ3 in the parameter space.

"... this averaging automatically controls the complexity of different models..." (Wetzels et al 2011)

The following 6 terms carry the same technical implications:
i. penalty towards extra parameters
ii. over-fitting
iii. complexity of the models
iv. weighted average of the likelihood
v. posterior probability
vi. evidence

That simple sign ∫ can imply 100-dimensional integrals!

More complex models tend to have larger prediction errors. In Bayesian terms: more parameters, more ∫, more complexity, larger prediction errors, etc. Eventually, such models are not preferred.

Approximation methods are used, say MCMC, BIC (usually reported alongside AIC), the Laplace method, etc. (web search)

Bayes Factor

Recall: Posterior = Evidence * Prior / p(x)

To compare two models, M0 and M1, we compare their posteriors:

Posterior of M0 = Evidence(M0) * Prior(M0) / p(x)
Posterior of M1 = Evidence(M1) * Prior(M1) / p(x)

Since (i) p(x) is the same, and (ii) if we further assume the two priors are equally likely, comparing posteriors becomes comparing evidence.

So, the Bayes Factor is B01 = Evidence(M0) / Evidence(M1), or

B01 = (∫ Pr(x/θ, M0) * Pr(θ/M0) dθ) / (∫ Pr(x/θ, M1) * Pr(θ/M1) dθ)

In the Bayesian approach, there is no built-in preference between M0 and M1, unlike the frequentist approach (which protects H0 and tries to reject it).

But Evidence(M0) and Evidence(M1) each need ∫ over all nuisance parameters!
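To make the evidence integral concrete, here is a small Python sketch on a hypothetical coin example that is not part of the lecture: 7 heads in 10 tosses, with M0 fixing θ = 0.5 and M1 placing a Uniform(0, 1) prior on θ. The function names and the midpoint-rule integration are illustrative choices.

```python
from math import comb

def binom_lik(k, n, theta):
    """Pr(k heads in n tosses / theta)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def evidence_uniform_prior(k, n, steps=10_000):
    """Integrate Pr(x/theta) * Pr(theta) over theta with a Uniform(0, 1) prior (midpoint rule)."""
    h = 1.0 / steps
    return sum(binom_lik(k, n, (i + 0.5) * h) for i in range(steps)) * h

k, n = 7, 10                       # hypothetical data
ev0 = binom_lik(k, n, 0.5)         # M0 has no free parameter, so no integral is needed
ev1 = evidence_uniform_prior(k, n) # M1 averages the likelihood over its prior
b01 = ev0 / ev1

print(round(ev0, 4), round(ev1, 4), round(b01, 3))
```

For this particular prior the integral has the closed form 1/(n + 1), which gives a quick check on the numerical result. M1 contains the best-fitting θ = 0.7, yet averaging over all θ lowers its evidence, which is the automatic penalty for the extra parameter.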

The Bayes Factor can apply to many situations, some with many parameters, say Factor Analysis, SEM, etc., and some with very few parameters, say the Bayesian t-test.

Now, if all nuisance parameters take their mle values (instead of ∫), the Bayes Factor equals the Likelihood Ratio (and we can use the LRT and then the classical approach).

Interpretations of Bayes Factor

B01           Interpretation
< 1/10        Substantially prefer M1, more than 10 times as likely
1/10 ~ 1/3    Slightly prefer M1, between 3 and 10 times as likely
1/3 ~ 3       Indifferent (within 3 times as likely)
3 ~ 10        Slightly prefer M0, between 3 and 10 times as likely
> 10          Substantially prefer M0, more than 10 times as likely

B01 = Evidence(M0) / Evidence(M1); B10 = Evidence(M1) / Evidence(M0); so B01 = 1 / B10.

(Because of the reciprocal relationship, the strength of evidence is the same for B01 and 1/B01; say, 1/3 and 3 carry the same strength.)
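The table can be turned into a small lookup function. This is a sketch: the function name is illustrative, and the handling of values that fall exactly on a boundary (1/10, 1/3, 3, 10) is an assumption, since the table does not specify it.

```python
def interpret_b01(b01):
    """Map a Bayes factor B01 to the verbal categories of the table."""
    if b01 <= 0:
        raise ValueError("A Bayes factor must be positive")
    if b01 < 1 / 10:
        return "Substantially prefer M1"
    if b01 < 1 / 3:
        return "Slightly prefer M1"
    if b01 <= 3:
        return "Indifferent"
    if b01 <= 10:
        return "Slightly prefer M0"
    return "Substantially prefer M0"

for value in (0.05, 0.2, 1.0, 5.0, 20.0):
    print(value, interpret_b01(value))
```

Because B01 = 1 / B10, a calculator that reports B10 can be used by passing `1 / b10` to the function.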

Bayesian t-test

There are many Bayesian t-tests; Rouder et al (2009) is only one convenient example.

Rouder et al (2009) and Wetzels et al (2011)

References:
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t-tests for Accepting and Rejecting the Null Hypothesis. Psychonomic Bulletin & Review, 16.
Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., & Wagenmakers, E.-J. (2011). Statistical Evidence in Experimental Psychology: An Empirical Comparison Using 855 t Tests. Perspectives on Psychological Science, 6(3).

µ: mean of the difference; σ: standard deviation of the difference
δ = µ/σ: effect size of the difference

M0: δ = 0 (hence µ = 0) (the so-called null)
M1: δ ~ N(0, σδ²), where σδ is the standard deviation of δ (alternative)

Note: the effect size δ gives a standard way to compare means of different populations, and researchers have an intrinsic scale about the ranges of effect sizes that applies broadly (Rouder et al, 2009).

For M0, δ = 0, and the evidence under M0 only takes δ = 0.
For M1, δ ~ N(0, σδ²), and the evidence under M1 automatically averages over all δ in the whole range. So, there is one more integration under M1.

In the above example, we only need to input n1 = n2 = 7, t = 3.83, r = 0.707. The results are:
Scaled JZS Bayes Factor = 14.7
Scaled-Information Bayes Factor = 17.7

But we need to take the reciprocals of the above values, because the web calculator calculates B10 instead of B01, so
Scaled JZS Bayes Factor = 1/14.7 ≈ 0.068

Scaled-Information Bayes Factor = 1/17.7 ≈ 0.056

We expect more packages for Bayesian applications in the future, but the principles remain the same.

In most cases, Bayesian t-tests give results similar to the usual t-test. But in marginal situations (e.g. p = 0.06), results can be different.

Summary (statistical terms are different from daily English)

Bayesian inference is based on the data we observed (i.e. Pr(θ/X), the "posterior"), which is more natural.

In doing so, we need to specify a prior, which is mostly unbiased or non-informative.

The posterior can have a distribution too complicated to be solved analytically, which has hindered the Bayesian approach and its packages.

Handling nuisance parameters is the major technical difficulty, but doing so automatically handles model complexity (it penalizes complicated models) and gives a direct probability statement (Pr(Model/Data)).

*** Important: Classical Pr(X/θ) vs Bayesian Pr(θ/X)

Q&A

Shing On LEUNG, Hui Ping WU


More information

Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing

Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing Agenda Introduction to Estimation Point estimation Interval estimation Introduction to Hypothesis Testing Concepts en terminology

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.

More information

Hierarchical Models & Bayesian Model Selection

Hierarchical Models & Bayesian Model Selection Hierarchical Models & Bayesian Model Selection Geoffrey Roeder Departments of Computer Science and Statistics University of British Columbia Jan. 20, 2016 Contact information Please report any typos or

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Choosing among models

Choosing among models Eco 515 Fall 2014 Chris Sims Choosing among models September 18, 2014 c 2014 by Christopher A. Sims. This document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported

More information

Mathematical Statistics

Mathematical Statistics Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics

More information

Bayesian tests of hypotheses

Bayesian tests of hypotheses Bayesian tests of hypotheses Christian P. Robert Université Paris-Dauphine, Paris & University of Warwick, Coventry Joint work with K. Kamary, K. Mengersen & J. Rousseau Outline Bayesian testing of hypotheses

More information

Bayesian Methods. David S. Rosenberg. New York University. March 20, 2018

Bayesian Methods. David S. Rosenberg. New York University. March 20, 2018 Bayesian Methods David S. Rosenberg New York University March 20, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 March 20, 2018 1 / 38 Contents 1 Classical Statistics 2 Bayesian

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2 Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate

More information

CS 188: Artificial Intelligence Spring Today

CS 188: Artificial Intelligence Spring Today CS 188: Artificial Intelligence Spring 2006 Lecture 9: Naïve Bayes 2/14/2006 Dan Klein UC Berkeley Many slides from either Stuart Russell or Andrew Moore Bayes rule Today Expectations and utilities Naïve

More information

Introduction to Machine Learning. Lecture 2

Introduction to Machine Learning. Lecture 2 Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for

More information

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia

More information

A BAYESIAN MATHEMATICAL STATISTICS PRIMER. José M. Bernardo Universitat de València, Spain

A BAYESIAN MATHEMATICAL STATISTICS PRIMER. José M. Bernardo Universitat de València, Spain A BAYESIAN MATHEMATICAL STATISTICS PRIMER José M. Bernardo Universitat de València, Spain jose.m.bernardo@uv.es Bayesian Statistics is typically taught, if at all, after a prior exposure to frequentist

More information

Objective Bayesian Hypothesis Testing

Objective Bayesian Hypothesis Testing Objective Bayesian Hypothesis Testing José M. Bernardo Universitat de València, Spain jose.m.bernardo@uv.es Statistical Science and Philosophy of Science London School of Economics (UK), June 21st, 2010

More information

Bayesian Inference. Introduction

Bayesian Inference. Introduction Bayesian Inference Introduction The frequentist approach to inference holds that probabilities are intrinsicially tied (unsurprisingly) to frequencies. This interpretation is actually quite natural. What,

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Preliminaries. Probabilities. Maximum Likelihood. Bayesian

More information

Part 4: Multi-parameter and normal models

Part 4: Multi-parameter and normal models Part 4: Multi-parameter and normal models 1 The normal model Perhaps the most useful (or utilized) probability model for data analysis is the normal distribution There are several reasons for this, e.g.,

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

A Power Fallacy. 1 University of Amsterdam. 2 University of California Irvine. 3 University of Missouri. 4 University of Groningen

A Power Fallacy. 1 University of Amsterdam. 2 University of California Irvine. 3 University of Missouri. 4 University of Groningen A Power Fallacy 1 Running head: A POWER FALLACY A Power Fallacy Eric-Jan Wagenmakers 1, Josine Verhagen 1, Alexander Ly 1, Marjan Bakker 1, Michael Lee 2, Dora Matzke 1, Jeff Rouder 3, Richard Morey 4

More information

Model comparison and selection

Model comparison and selection BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)

More information

Unobservable Parameter. Observed Random Sample. Calculate Posterior. Choosing Prior. Conjugate prior. population proportion, p prior:

Unobservable Parameter. Observed Random Sample. Calculate Posterior. Choosing Prior. Conjugate prior. population proportion, p prior: Pi Priors Unobservable Parameter population proportion, p prior: π ( p) Conjugate prior π ( p) ~ Beta( a, b) same PDF family exponential family only Posterior π ( p y) ~ Beta( a + y, b + n y) Observed

More information

Chapter 5. Bayesian Statistics

Chapter 5. Bayesian Statistics Chapter 5. Bayesian Statistics Principles of Bayesian Statistics Anything unknown is given a probability distribution, representing degrees of belief [subjective probability]. Degrees of belief [subjective

More information

Linear Models A linear model is defined by the expression

Linear Models A linear model is defined by the expression Linear Models A linear model is defined by the expression x = F β + ɛ. where x = (x 1, x 2,..., x n ) is vector of size n usually known as the response vector. β = (β 1, β 2,..., β p ) is the transpose

More information

the unification of statistics its uses in practice and its role in Objective Bayesian Analysis:

the unification of statistics its uses in practice and its role in Objective Bayesian Analysis: Objective Bayesian Analysis: its uses in practice and its role in the unification of statistics James O. Berger Duke University and the Statistical and Applied Mathematical Sciences Institute Allen T.

More information

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf 1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Lior Wolf 2014-15 We know that X ~ B(n,p), but we do not know p. We get a random sample from X, a

More information

Week 8 Hour 1: More on polynomial fits. The AIC

Week 8 Hour 1: More on polynomial fits. The AIC Week 8 Hour 1: More on polynomial fits. The AIC Hour 2: Dummy Variables Hour 3: Interactions Stat 302 Notes. Week 8, Hour 3, Page 1 / 36 Interactions. So far we have extended simple regression in the following

More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Bayesian Estimation An Informal Introduction

Bayesian Estimation An Informal Introduction Mary Parker, Bayesian Estimation An Informal Introduction page 1 of 8 Bayesian Estimation An Informal Introduction Example: I take a coin out of my pocket and I want to estimate the probability of heads

More information

Statistical Foundations:

Statistical Foundations: Statistical Foundations: t distributions, t-tests tests Psychology 790 Lecture #12 10/03/2006 Today sclass The t-distribution t ib ti in its full glory. Why we use it for nearly everything. Confidence

More information