MAS3301 Bayesian Statistics
M. Farrow
School of Mathematics and Statistics, Newcastle University
Semester 2,
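11 Conjugate Priors IV: The Dirichlet distribution and multinomial observations

11.1 The Dirichlet distribution

The Dirichlet distribution is a distribution for a set of quantities $\theta_1, \ldots, \theta_m$ where $\theta_i \ge 0$ and $\sum_{i=1}^m \theta_i = 1$. An obvious application is to a set of probabilities for a partition (i.e. for an exhaustive set of mutually exclusive events).

The probability density function is

\[ f(\theta_1, \ldots, \theta_m) = \frac{\Gamma(A)}{\prod_{i=1}^m \Gamma(a_i)} \prod_{i=1}^m \theta_i^{a_i - 1} \]

where $A = \sum_{i=1}^m a_i$ and $a_1, \ldots, a_m$ are parameters with $a_i > 0$ for $i = 1, \ldots, m$.

Clearly, if $m = 2$, we obtain a beta$(a_1, a_2)$ distribution as a special case.

The mean of $\theta_j$ is

\[ E(\theta_j) = \frac{a_j}{A}, \]

the variance of $\theta_j$ is

\[ \mathrm{var}(\theta_j) = \frac{a_j}{A(A+1)} - \frac{a_j^2}{A^2(A+1)} \]

and the covariance of $\theta_j$ and $\theta_k$, where $j \ne k$, is

\[ \mathrm{covar}(\theta_j, \theta_k) = -\frac{a_j a_k}{A^2(A+1)}. \]

Also the marginal distribution of $\theta_j$ is beta$(a_j, A - a_j)$.

Note that the space of the parameters $\theta_1, \ldots, \theta_m$ has only $m - 1$ dimensions because of the constraint $\sum_{i=1}^m \theta_i = 1$, so that, for example, $\theta_m = 1 - \sum_{i=1}^{m-1} \theta_i$. Therefore, when we integrate over this space, the integration has only $m - 1$ dimensions.

Proof (mean)

The mean is

\[ E(\theta_j) = \int \cdots \int \theta_j \frac{\Gamma(A)}{\prod_{i=1}^m \Gamma(a_i)} \prod_{i=1}^m \theta_i^{a_i - 1} \, d\theta_1 \ldots d\theta_{m-1} \]
\[ = \frac{\Gamma(A)}{\Gamma(A+1)} \frac{\Gamma(a_j + 1)}{\Gamma(a_j)} \int \cdots \int \frac{\Gamma(A+1)}{\prod_{i=1}^m \Gamma(a_i')} \prod_{i=1}^m \theta_i^{a_i' - 1} \, d\theta_1 \ldots d\theta_{m-1} \]
\[ = \frac{\Gamma(A)}{\Gamma(A+1)} \frac{\Gamma(a_j + 1)}{\Gamma(a_j)} = \frac{a_j}{A} \]

where $a_i' = a_i$ when $i \ne j$ and $a_j' = a_j + 1$.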
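The moment formulas stated in Section 11.1 are straightforward to evaluate numerically. The following is a minimal sketch in plain Python; the function name `dirichlet_moments` and the parameter values are illustrative assumptions, not part of the original notes. It also checks one consequence of the constraint $\sum \theta_i = 1$: each row of the covariance matrix sums to zero.

```python
def dirichlet_moments(a):
    """Return (means, variances, covariance matrix) of a Dirichlet(a_1, ..., a_m).

    Uses the closed-form expressions from Section 11.1.
    """
    A = sum(a)
    m = len(a)
    means = [a_j / A for a_j in a]
    # var(theta_j) = a_j / (A (A + 1)) - a_j^2 / (A^2 (A + 1))
    variances = [a_j / (A * (A + 1)) - a_j ** 2 / (A ** 2 * (A + 1)) for a_j in a]
    # covar(theta_j, theta_k) = -a_j a_k / (A^2 (A + 1)) for j != k
    cov = [
        [variances[j] if j == k else -a[j] * a[k] / (A ** 2 * (A + 1)) for k in range(m)]
        for j in range(m)
    ]
    return means, variances, cov


# Illustrative parameter values (an assumption for this sketch).
means, variances, cov = dirichlet_moments([2.0, 3.0, 5.0])
print("means:", means)
print("variances:", variances)
# Because the theta_i sum to 1, each covariance row should sum to (about) zero.
print("row sums:", [sum(row) for row in cov])
```

With $m = 2$ the same function returns the mean and variance of a beta$(a_1, a_2)$ distribution, in line with the special case noted above.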
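Proof (variance)

Similarly

\[ E(\theta_j^2) = \frac{\Gamma(A)}{\Gamma(A+2)} \frac{\Gamma(a_j + 2)}{\Gamma(a_j)} = \frac{(a_j + 1) a_j}{(A+1)A} \]

so

\[ \mathrm{var}(\theta_j) = \frac{(a_j + 1) a_j}{(A+1)A} - \left( \frac{a_j}{A} \right)^2 = \frac{a_j}{A(A+1)} - \frac{a_j^2}{A^2(A+1)}. \]

Proof (covariance)

Also

\[ E(\theta_j \theta_k) = \frac{\Gamma(A)}{\Gamma(A+2)} \frac{\Gamma(a_j + 1)}{\Gamma(a_j)} \frac{\Gamma(a_k + 1)}{\Gamma(a_k)} = \frac{a_j a_k}{(A+1)A} \]

so

\[ \mathrm{covar}(\theta_j, \theta_k) = \frac{a_j a_k}{(A+1)A} - \frac{a_j}{A} \frac{a_k}{A} = -\frac{a_j a_k}{A^2(A+1)}. \]

Proof (marginal)

We can write the joint density of $\theta_1, \ldots, \theta_m$ as

\[ f_1(\theta_1) f_2(\theta_2 \mid \theta_1) f_3(\theta_3 \mid \theta_1, \theta_2) \cdots f_{m-1}(\theta_{m-1} \mid \theta_1, \ldots, \theta_{m-2}). \]

(We do not need to include a final term in this for $\theta_m$ because $\theta_m$ is fixed once $\theta_1, \ldots, \theta_{m-1}$ are fixed.)

In fact we can write the joint density as

\[ \frac{\Gamma(A)}{\Gamma(a_1)\Gamma(A - a_1)} \theta_1^{a_1 - 1} (1 - \theta_1)^{A - a_1 - 1} \]
\[ \times \frac{\Gamma(A - a_1)}{\Gamma(a_2)\Gamma(A - a_1 - a_2)} \frac{\theta_2^{a_2 - 1} (1 - \theta_1 - \theta_2)^{A - a_1 - a_2 - 1}}{(1 - \theta_1)^{A - a_1 - 1}} \]
\[ \times \cdots \times \frac{\Gamma(A - a_1 - \cdots - a_{m-2})}{\Gamma(a_{m-1})\Gamma(A - a_1 - \cdots - a_{m-1})} \frac{\theta_{m-1}^{a_{m-1} - 1} \theta_m^{a_m - 1}}{(1 - \theta_1 - \cdots - \theta_{m-2})^{a_{m-1} + a_m - 1}}. \]

A bit of cancelling shows that this simplifies to the correct Dirichlet density.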
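To make the cancelling concrete, here is a worked sketch for the smallest non-trivial case, $m = 3$. It uses only the factors written above, with $A = a_1 + a_2 + a_3$ and $\theta_3 = 1 - \theta_1 - \theta_2$; the choice of $m = 3$ is ours, for illustration.

```latex
\begin{align*}
f_1(\theta_1)\, f_2(\theta_2 \mid \theta_1)
 &= \frac{\Gamma(A)}{\Gamma(a_1)\Gamma(A-a_1)}\,
    \theta_1^{a_1-1}(1-\theta_1)^{A-a_1-1}
    \times
    \frac{\Gamma(A-a_1)}{\Gamma(a_2)\Gamma(A-a_1-a_2)}\,
    \frac{\theta_2^{a_2-1}(1-\theta_1-\theta_2)^{A-a_1-a_2-1}}
         {(1-\theta_1)^{A-a_1-1}} \\
 &= \frac{\Gamma(A)}{\Gamma(a_1)\Gamma(a_2)\Gamma(A-a_1-a_2)}\,
    \theta_1^{a_1-1}\theta_2^{a_2-1}(1-\theta_1-\theta_2)^{A-a_1-a_2-1} \\
 &= \frac{\Gamma(A)}{\Gamma(a_1)\Gamma(a_2)\Gamma(a_3)}\,
    \theta_1^{a_1-1}\theta_2^{a_2-1}\theta_3^{a_3-1}.
\end{align*}
```

The factors $\Gamma(A - a_1)$ and $(1 - \theta_1)^{A - a_1 - 1}$ cancel between the two terms, and the last line uses $A - a_1 - a_2 = a_3$.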
Thus we can see that the marginal distribution of $\theta_1$ is a beta$(a_1, A - a_1)$ distribution and similarly that the marginal distribution of $\theta_j$ is a beta$(a_j, A - a_j)$ distribution.

We can also deduce the distribution of a subset of $\theta_1, \ldots, \theta_m$. For example, if $\tilde\theta_3 = 1 - \theta_1 - \theta_2 - \theta_3$, then the distribution of $\theta_1, \theta_2, \theta_3, \tilde\theta_3$ is Dirichlet$(a_1, a_2, a_3, \tilde a_3)$ where $\tilde a_3 = A - a_1 - a_2 - a_3$.

11.2 Multinomial observations

Model

Suppose that we will observe $X_1, \ldots, X_m$ where these are the frequencies for categories $1, \ldots, m$, the total $N = \sum_{i=1}^m X_i$ is fixed and the probabilities for these categories are $\theta_1, \ldots, \theta_m$ where $\sum_{i=1}^m \theta_i = 1$. Then, given $\theta$, where $\theta = (\theta_1, \ldots, \theta_m)^T$, the distribution of $X_1, \ldots, X_m$ is multinomial with

\[ \Pr(X_1 = x_1, \ldots, X_m = x_m) = \frac{N!}{\prod_{i=1}^m x_i!} \prod_{i=1}^m \theta_i^{x_i}. \]

Notice that, with $m = 2$, this is just a binomial$(N, \theta_1)$ distribution.

Then the likelihood is

\[ L(\theta; x) = \frac{N!}{\prod_{i=1}^m x_i!} \prod_{i=1}^m \theta_i^{x_i} \propto \prod_{i=1}^m \theta_i^{x_i}. \]

The conjugate prior is a Dirichlet distribution which has a pdf proportional to

\[ \prod_{i=1}^m \theta_i^{a_i - 1}. \]

The posterior pdf is proportional to

\[ \prod_{i=1}^m \theta_i^{a_i - 1} \times \prod_{i=1}^m \theta_i^{x_i} = \prod_{i=1}^m \theta_i^{a_i + x_i - 1}. \]

This is proportional to the pdf of a Dirichlet distribution with parameters $a_1 + x_1, a_2 + x_2, \ldots, a_m + x_m$.

Example

In a survey 1000 English voters are asked to say for which party they would vote if there were a general election next week. The choices offered were 1: Labour, 2: Liberal, 3: Conservative, 4: Other, 5: None, 6: Undecided. We assume that the population is large enough so that the responses may be considered independent given the true underlying proportions. Let $\theta_1, \ldots, \theta_6$ be the probabilities that a randomly selected voter would give each of the responses. Our prior distribution for $\theta_1, \ldots, \theta_6$ is a Dirichlet(5, 3, 5, 1, 2, 4) distribution. This gives the following summary of the prior distribution, with entries computed from the formulas of Section 11.1 with $A = 20$.

Response        a_i    Prior mean    Prior var.    Prior sd.
Labour            5      0.25         0.00893       0.0945
Liberal           3      0.15         0.00607       0.0779
Conservative      5      0.25         0.00893       0.0945
Other             1      0.05         0.00226       0.0476
None              2      0.10         0.00429       0.0655
Undecided         4      0.20         0.00762       0.0873
Total            20      1.00

Suppose our observed data are as follows.
Labour    Liberal    Conservative    Other    None    Undecided

Then we can summarise the posterior distribution as follows.

Response        a_i + x_i    Posterior mean    Posterior var.    Posterior sd.
Labour
Liberal
Conservative
Other
None
Undecided
Total
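The conjugate update amounts to adding the observed counts to the prior parameters and re-applying the Dirichlet moment formulas. The sketch below shows the mechanics in plain Python. The prior Dirichlet(5, 3, 5, 1, 2, 4) is taken from the example, but the counts in `counts` are hypothetical values chosen only so that they sum to 1000; they are not the survey data from the original notes, and the helper name `dirichlet_summary` is likewise an assumption.

```python
from math import sqrt


def dirichlet_summary(params):
    """Return a list of (parameter, mean, variance, sd) for each component."""
    A = sum(params)
    rows = []
    for p in params:
        mean = p / A
        # Equivalent to p / (A (A + 1)) - p^2 / (A^2 (A + 1)).
        var = p * (A - p) / (A ** 2 * (A + 1))
        rows.append((p, mean, var, sqrt(var)))
    return rows


labels = ["Labour", "Liberal", "Conservative", "Other", "None", "Undecided"]
prior = [5, 3, 5, 1, 2, 4]

# HYPOTHETICAL counts for illustration only; they sum to N = 1000.
counts = [300, 150, 350, 50, 50, 100]

# Conjugate update: posterior parameters are a_i + x_i.
posterior = [a + x for a, x in zip(prior, counts)]

print(f"{'Response':<14}{'a_i+x_i':>8}{'mean':>9}{'var':>12}{'sd':>9}")
for label, (p, mean, var, sd) in zip(labels, dirichlet_summary(posterior)):
    print(f"{label:<14}{p:>8}{mean:>9.4f}{var:>12.7f}{sd:>9.4f}")
```

Calling `dirichlet_summary(prior)` instead gives the corresponding summary of the prior distribution.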
12 Sufficiency

12.1 Introduction

Consider the following problem. We are going to observe two random variables $X_1, X_2$. In each case, given the value of $\mu$, we have

\[ X_i \mid \mu \sim N(\mu, V) \]

where the variance $V$ is known but we wish to learn about the value of $\mu$. Further, given $\mu$, the two variables $X_1, X_2$ are independent.

The likelihood comes from the joint pdf of $X_1, X_2$ but an exactly equivalent observation would be $Y_1, Y_2$ where

\[ Y_1 = X_1 + X_2, \qquad Y_2 = X_1 - X_2. \]

It is easily seen that

\[ Y_1 \sim N(2\mu, 2V), \qquad Y_2 \sim N(0, 2V) \]

and that $Y_1$ and $Y_2$ are independent. Therefore the distribution of $Y_2$ does not depend on $\mu$ and its value cannot tell us anything about $\mu$. On the other hand the value of $Y_1$ tells us everything which we can learn from the data about $\mu$. We say that $Y_1$ is sufficient for $\mu$ and $Y_2$ is ancillary for $\mu$.

12.2 Definition

Suppose we have an unknown (e.g. a parameter) $\theta$ and we will observe data $Y$. The density (or probability) of $Y$ given $\theta$ is $f_{Y|\theta}(y \mid \theta)$ and this gives us the likelihood, $L(\theta; y)$. Suppose we have a statistic $T(Y)$, with value $t$. Since, once we know $Y$, we can calculate $T$, we can always write

\[ f_{Y|\theta}(y \mid \theta) = f_{Y,T|\theta}(y, t \mid \theta) = f_{T|\theta}(t \mid \theta) f_{Y|t,\theta}(y \mid t, \theta). \]

In some cases $f_{Y|t,\theta}(y \mid t, \theta)$ does not depend on $\theta$ so

\[ f_{Y|t,\theta}(y \mid t, \theta) = f_{Y|t}(y \mid t). \]

In this case

\[ f_{Y|\theta}(y \mid \theta) = f_{T|\theta}(t \mid \theta) f_{Y|t}(y \mid t). \qquad (9) \]

In such a case we say that $T(Y)$ is a sufficient statistic for $\theta$ given $Y$. Often we simply say that $T$ is sufficient for $\theta$.

12.3 Factorisation theorem

Another way to express (9) is to say that $T$ is sufficient for $\theta$ if and only if there are functions $g, h$ such that

\[ f_{Y|\theta}(y \mid \theta) = g(\theta, t) h(y) \qquad (10) \]

where $h(y)$ does not depend on $\theta$. This is known as Neyman's factorisation theorem.

Proof: If $T$ is sufficient for $\theta$ then we can write $g(\theta, t) = f_{T|\theta}(t \mid \theta)$ and $h(y) = f_{Y|t}(y \mid t)$.

To prove the converse we start by integrating (or summing) (10) over all values of $y$ where $T(y) = t$. This gives

\[ f_{T|\theta}(t \mid \theta) = g(\theta, t) H(t) \]
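for some function $H(t)$. This gives us

\[ g(\theta, t) = \frac{f_{T|\theta}(t \mid \theta)}{H(t)} \]

which we substitute in (10) to obtain

\[ f_{Y|\theta}(y \mid \theta) = \frac{f_{T|\theta}(t \mid \theta) h(y)}{H(t)}. \]

Now

\[ f_{Y|t,\theta}(y \mid t, \theta) = \frac{f_{Y,T|\theta}(y, t \mid \theta)}{f_{T|\theta}(t \mid \theta)} = \frac{f_{Y|\theta}(y \mid \theta)}{f_{T|\theta}(t \mid \theta)} \]

so

\[ f_{Y|t,\theta}(y \mid t, \theta) = \frac{h(y)}{H(t)}. \]

The right hand side of this equation does not depend on $\theta$ so the theorem is proved.

12.4 Sufficiency principle

From (9) we can see that, if $T$ is sufficient for $\theta$, then the likelihood for $\theta$ from $y$ is proportional to the likelihood for $\theta$ from $t$. Therefore, instead of using the likelihood for the full data we can use the likelihood based simply on the distribution of $T$.

12.5 Examples

Poisson

Suppose that we observe random variables $Y_1, \ldots, Y_n$ where, given the value of the parameter $\lambda$, $Y_i$ is independent of $Y_j$ for $i \ne j$ and $Y_i \sim$ Poisson$(\lambda)$ for $i = 1, \ldots, n$. Then the likelihood is

\[ L(\lambda; y) = \prod_{i=1}^n \frac{e^{-\lambda} \lambda^{y_i}}{y_i!} = e^{-n\lambda} \lambda^S \prod_{i=1}^n \frac{1}{y_i!} = g(\lambda, S) h(y) \]

where $S = \sum_{i=1}^n y_i$, $g(\lambda, S) = e^{-n\lambda} \lambda^S$ and $h(y) = \prod_{i=1}^n \frac{1}{y_i!}$. So $S$ is sufficient for $\lambda$.

Furthermore $S \sim$ Poisson$(n\lambda)$ so an equivalent likelihood is

\[ L_S(\lambda; y) = \frac{e^{-n\lambda} (n\lambda)^S}{S!} \propto e^{-n\lambda} \lambda^S. \]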
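One way to see what sufficiency of $S$ means in practice: two samples of the same size with the same total give likelihoods that differ only by a factor free of $\lambda$. The following plain-Python sketch checks this and the factorisation $L = g(\lambda, S) h(y)$ numerically. The two samples are hypothetical values chosen for illustration, and the function names are ours.

```python
from math import exp, factorial, prod


def poisson_likelihood(lam, ys):
    """Full-data likelihood: product of Poisson(lam) probabilities."""
    return prod(exp(-lam) * lam ** y / factorial(y) for y in ys)


def g(lam, S, n):
    """The factor that depends on lambda, through S only."""
    return exp(-n * lam) * lam ** S


def h(ys):
    """The factor that does not depend on lambda."""
    return 1.0 / prod(factorial(y) for y in ys)


# Two HYPOTHETICAL samples with n = 4 and the same total S = 6.
ys1 = [2, 0, 3, 1]
ys2 = [1, 1, 2, 2]

for lam in (0.5, 1.0, 2.5):
    full = poisson_likelihood(lam, ys1)
    factored = g(lam, sum(ys1), len(ys1)) * h(ys1)
    ratio = poisson_likelihood(lam, ys1) / poisson_likelihood(lam, ys2)
    # The ratio equals h(ys1) / h(ys2) and so is the same for every lambda.
    print(lam, full, factored, ratio)
```

Since the ratio is constant in $\lambda$, both samples lead to the same posterior distribution for any given prior.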
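Normal

Suppose that we observe random variables $Y_1, \ldots, Y_n$ where, given the value of the parameters $\mu, \sigma^2$, $Y_i$ is independent of $Y_j$ for $i \ne j$ and $Y_i \sim N(\mu, \sigma^2)$ for $i = 1, \ldots, n$. Here the parameter is $\theta = (\mu, \sigma^2)^T$. The likelihood is

\[ L(\mu, \sigma^2; y) = \prod_{i=1}^n (2\pi\sigma^2)^{-1/2} \exp\left\{ -\frac{1}{2\sigma^2} (y_i - \mu)^2 \right\} \]
\[ = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - \mu)^2 \right\} \]
\[ = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - \bar y + \bar y - \mu)^2 \right\} \]
\[ = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ \sum_{i=1}^n (y_i - \bar y)^2 + n(\bar y - \mu)^2 \right] \right\} \]
\[ = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ S + n(\bar y - \mu)^2 \right] \right\} \]
\[ = g(\theta, T) h(y) \]

where $h(y) = 1$, $T = (\bar y, S)^T$,

\[ \bar y = \frac{1}{n} \sum_{i=1}^n y_i \quad \text{and} \quad S = \sum_{i=1}^n (y_i - \bar y)^2. \]

Hence $\bar y$ and $S$ are sufficient for $\mu$ and $\sigma^2$.

Furthermore, in the case where $\sigma^2$ is known, $\bar y$ is sufficient for $\mu$ since

\[ L(\mu; y) = \exp\left\{ -\frac{n}{2\sigma^2} (\bar y - \mu)^2 \right\} (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{S}{2\sigma^2} \right\} = g(\mu, \bar y) h(y) \]

with

\[ h(y) = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{S}{2\sigma^2} \right\}. \]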
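The decomposition $\sum (y_i - \mu)^2 = S + n(\bar y - \mu)^2$ used above can be checked numerically. The sketch below, in plain Python, evaluates the log-likelihood once from the full data and once from $(n, \bar y, S)$ alone. The data values and function names are hypothetical, chosen only to illustrate the identity.

```python
from math import log, pi


def loglik_full(mu, sigma2, ys):
    """Normal log-likelihood computed from every observation."""
    n = len(ys)
    return -0.5 * n * log(2 * pi * sigma2) - sum((y - mu) ** 2 for y in ys) / (2 * sigma2)


def loglik_sufficient(mu, sigma2, n, ybar, S):
    """The same log-likelihood computed from n, ybar and S only."""
    return -0.5 * n * log(2 * pi * sigma2) - (S + n * (ybar - mu) ** 2) / (2 * sigma2)


# HYPOTHETICAL data for illustration.
ys = [4.1, 5.3, 3.8, 6.0, 5.2]
n = len(ys)
ybar = sum(ys) / n
S = sum((y - ybar) ** 2 for y in ys)

for mu, sigma2 in [(4.0, 1.0), (5.0, 0.5), (6.5, 2.0)]:
    a = loglik_full(mu, sigma2, ys)
    b = loglik_sufficient(mu, sigma2, n, ybar, S)
    # The two values agree up to floating-point rounding.
    print(mu, sigma2, a, b)
```

Once $\bar y$ and $S$ have been computed, the individual observations are no longer needed to evaluate the likelihood at any $(\mu, \sigma^2)$.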
Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order
More informationPractice Examination # 3
Practice Examination # 3 Sta 23: Probability December 13, 212 This is a closed-book exam so do not refer to your notes, the text, or any other books (please put them on the floor). You may use a single
More informationStat 5101 Lecture Slides: Deck 7 Asymptotics, also called Large Sample Theory. Charles J. Geyer School of Statistics University of Minnesota
Stat 5101 Lecture Slides: Deck 7 Asymptotics, also called Large Sample Theory Charles J. Geyer School of Statistics University of Minnesota 1 Asymptotic Approximation The last big subject in probability
More informationLecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2)
Lectures on Machine Learning (Fall 2017) Hyeong In Choi Seoul National University Lecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2) Topics to be covered:
More information2 Belief, probability and exchangeability
2 Belief, probability and exchangeability We first discuss what properties a reasonable belief function should have, and show that probabilities have these properties. Then, we review the basic machinery
More informationSTAT J535: Chapter 5: Classes of Bayesian Priors
STAT J535: Chapter 5: Classes of Bayesian Priors David B. Hitchcock E-Mail: hitchcock@stat.sc.edu Spring 2012 The Bayesian Prior A prior distribution must be specified in a Bayesian analysis. The choice
More informationChapter 3 : Likelihood function and inference
Chapter 3 : Likelihood function and inference 4 Likelihood function and inference The likelihood Information and curvature Sufficiency and ancilarity Maximum likelihood estimation Non-regular models EM
More information