# Measurement Error and Linear Regression of Astronomical Data

Brandon Kelly, Penn State Summer School in Astrostatistics, June 2007


## Classical Regression Model

Collect n data points; denote the i-th pair as (η_i, ξ_i), where η is the dependent variable (or "response") and ξ is the independent variable (or "predictor", "covariate"). Assume the usual additive error model:

$$\eta_i = \alpha + \beta \xi_i + \varepsilon_i, \qquad E(\varepsilon_i) = 0, \qquad \mathrm{Var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2$$

## Ordinary Least Squares (OLS)

The OLS slope estimate is

$$\hat{\beta} = \frac{\mathrm{Cov}(\xi, \eta)}{\mathrm{Var}(\xi)}$$

Wait, that's not in Bevington... The error term, ε, encompasses real physical variations in source properties, i.e., the intrinsic scatter.
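The moment form of the OLS estimate above can be sketched in a few lines of numpy (the simulated data and all parameter values here are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the classical model: eta_i = alpha + beta*xi_i + eps_i,
# where eps_i is the intrinsic scatter (parameter values are illustrative).
n, alpha, beta, sigma = 1000, 1.0, 0.5, 0.3
xi = rng.normal(0.0, 1.0, n)
eta = alpha + beta * xi + rng.normal(0.0, sigma, n)

# OLS in moment form: beta_hat = Cov(xi, eta) / Var(xi)
beta_hat = np.cov(xi, eta)[0, 1] / np.var(xi, ddof=1)
alpha_hat = eta.mean() - beta_hat * xi.mean()
print(beta_hat, alpha_hat)
```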

## Example: Photoionization Physics in Broad Line AGN

Test whether the distance between the BLR and the continuum source is set by photoionization physics. From the definition of the ionization parameter, U:

$$R^2 = \frac{L_{\rm ion}}{4\pi c^2 U n_e \langle E \rangle}$$

$$\log R = \frac{1}{2}\log L_{\rm ion} - \frac{1}{2}\log(4\pi c^2) - \frac{1}{2}\left(\log U + \log n_e + \log\langle E \rangle\right)$$

$$\log R = \alpha + \beta \log L_{\rm ion} + \varepsilon$$

$$\beta = \frac{1}{2}, \qquad \alpha = -\frac{1}{2}\left[E(\log U + \log n_e + \log\langle E \rangle) + \log(4\pi c^2)\right]$$

$$\varepsilon = -\frac{1}{2}\left[(\log U + \log n_e + \log\langle E \rangle) - E(\log U + \log n_e + \log\langle E \rangle)\right]$$

## BLR Size vs. Luminosity

Bentz et al., 2006, ApJ, 644, 133: BLR size vs. luminosity, uncorrected for host galaxy starlight (top) and corrected for starlight (bottom). Some scatter is due to measurement errors, but some is due to intrinsic variations.

## Measurement Errors

We don't observe (η, ξ); we observe the measured values (y, x) instead. Measurement errors add an additional level to the statistical model:

$$\eta_i = \alpha + \beta \xi_i + \varepsilon_i, \qquad E(\varepsilon_i) = 0, \qquad E(\varepsilon_i^2) = \sigma^2$$

$$x_i = \xi_i + \epsilon_{x,i}, \qquad E(\epsilon_{x,i}) = 0, \qquad E(\epsilon_{x,i}^2) = \sigma_{x,i}^2$$

$$y_i = \eta_i + \epsilon_{y,i}, \qquad E(\epsilon_{y,i}) = 0, \qquad E(\epsilon_{y,i}^2) = \sigma_{y,i}^2$$

$$E(\epsilon_{x,i}\,\epsilon_{y,i}) = \sigma_{xy,i}$$

## Different Types of Measurement Error

- Produced by the measuring instrument: CCD read noise, dark current
- Poisson counting errors: uncertainty in the photon count rate creates measurement error on the flux
- Quantities inferred from fitting a parametric model: e.g., using a spectral model to measure flux density
- Using observables as proxies for unobservables: e.g., using stellar velocity dispersion to measure black hole mass, or galaxy flux to measure star formation rate. Here the measurement error is set by intrinsic source variations and won't decrease with better instruments or bigger telescopes.

## Effect on the Moments

Measurement errors alter the moments of the joint distribution of the response and covariate, biasing the correlation coefficient and slope estimate:

$$\hat{b} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)} = \frac{\mathrm{Cov}(\xi, \eta) + \sigma_{xy}}{\mathrm{Var}(\xi) + \sigma_x^2}$$

$$\hat{r} = \frac{\mathrm{Cov}(x, y)}{[\mathrm{Var}(x)\,\mathrm{Var}(y)]^{1/2}} = \frac{\mathrm{Cov}(\xi, \eta) + \sigma_{xy}}{[\mathrm{Var}(\xi) + \sigma_x^2]^{1/2}\,[\mathrm{Var}(\eta) + \sigma_y^2]^{1/2}}$$

- If the measurement errors are uncorrelated, the regression slope is unaffected by measurement error in the response.
- If the errors are uncorrelated, measurement error in the covariate biases both the regression slope and the correlation toward zero (attenuation).

## Degree of Bias Depends on Magnitude of Measurement Error

Define the ratios of measurement error variance to observed variance:

$$R_X = \frac{\sigma_x^2}{\mathrm{Var}(x)}, \qquad R_Y = \frac{\sigma_y^2}{\mathrm{Var}(y)}, \qquad R_{XY} = \frac{\sigma_{xy}}{\mathrm{Cov}(x, y)}$$

The bias can then be expressed as:

$$\frac{\hat{b}}{\beta} = \frac{1 - R_X}{1 - R_{XY}}, \qquad \frac{\hat{r}}{\rho} = \frac{(1 - R_X)^{1/2}\,(1 - R_Y)^{1/2}}{1 - R_{XY}}$$
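The attenuation formula above is easy to verify by simulation. A sketch, assuming uncorrelated errors (so R_XY = 0) and made-up error variances:

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 100_000, 1.0

xi = rng.normal(0.0, 1.0, n)                 # true covariate, Var(xi) = 1
eta = beta * xi + rng.normal(0.0, 0.5, n)    # intrinsic scatter
x = xi + rng.normal(0.0, 1.0, n)             # covariate error, sigma_x^2 = 1
y = eta + rng.normal(0.0, 0.5, n)            # response error, uncorrelated

# Naive slope on the measured values
b_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Predicted attenuation with R_XY = 0: b_hat / beta = 1 - R_X
R_X = 1.0 / np.var(x, ddof=1)                # sigma_x^2 / Var(x)
print(b_hat, beta * (1.0 - R_X))             # both approximately 0.5
```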

## True Distribution vs. Measured Distribution

[Figure: comparison of the true distribution of (ξ, η) with the measured distribution of (x, y).]

## BCES Estimator

The BCES approach (Akritas & Bershady, 1996, ApJ, 470, 706; see also Fuller 1987, Measurement Error Models) is to debias the moments:

$$\hat{\beta}_{\rm BCES} = \frac{\mathrm{Cov}(x, y) - \overline{\sigma}_{xy}}{\mathrm{Var}(x) - \overline{\sigma}_x^2}, \qquad \hat{\alpha}_{\rm BCES} = \bar{y} - \hat{\beta}_{\rm BCES}\,\bar{x}$$

Akritas & Bershady also give estimators for the bisector and orthogonal regression slopes. The estimator is asymptotically normal, the variance of the coefficients can be estimated from the data, and the variance of the measurement error can depend on the measured value.
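A minimal sketch of the debiased (Y|X) BCES estimates, assuming known, possibly heteroskedastic error variances (the helper name `bces_slope` and the test data are mine, not from the BCES code distribution):

```python
import numpy as np

def bces_slope(x, y, sig_x2, sig_xy):
    """BCES (Y|X) estimates: subtract the mean measurement-error
    (co)variances from the observed moments, then solve for the line."""
    beta = ((np.cov(x, y)[0, 1] - sig_xy.mean())
            / (np.var(x, ddof=1) - sig_x2.mean()))
    alpha = y.mean() - beta * x.mean()
    return beta, alpha

# Check on simulated data with known truth (alpha, beta) = (1, 2)
rng = np.random.default_rng(2)
n = 50_000
xi = rng.normal(0.0, 1.0, n)
eta = 1.0 + 2.0 * xi + rng.normal(0.0, 0.3, n)
sig_x2 = np.full(n, 0.5)      # per-point error variances (constant here,
sig_xy = np.zeros(n)          # but they may vary from point to point)
x = xi + rng.normal(0.0, np.sqrt(sig_x2))
y = eta + rng.normal(0.0, np.sqrt(0.5), n)

beta_hat, alpha_hat = bces_slope(x, y, sig_x2, sig_xy)
print(beta_hat, alpha_hat)
```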

## BCES: Advantages vs. Disadvantages

Advantages:

- Asymptotically unbiased and normal; variance of the coefficients can be estimated from the data
- Variance of the measurement error can depend on the measured value
- Easy and fast to compute

Disadvantages:

- Can be unstable and highly variable for small samples and/or large measurement errors
- Can be biased for small samples and/or large measurement errors
- Convergence to the asymptotic limit may be slow and depends on the size of the measurement errors

## FITEXY Estimator

Press et al. (1992, Numerical Recipes) define an effective χ² statistic:

$$\chi^2_{\rm EXY} = \sum_{i=1}^{n} \frac{(y_i - \alpha - \beta x_i)^2}{\sigma_{y,i}^2 + \beta^2 \sigma_{x,i}^2}$$

and choose the values of α and β that minimize χ²_EXY. Tremaine et al. (2002, ApJ, 574, 740) modified the statistic to account for intrinsic scatter:

$$\chi^2_{\rm EXY} = \sum_{i=1}^{n} \frac{(y_i - \alpha - \beta x_i)^2}{\sigma^2 + \sigma_{y,i}^2 + \beta^2 \sigma_{x,i}^2}$$
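Minimizing χ²_EXY over α and β is straightforward numerically; a sketch using `scipy.optimize.minimize` on simulated data with known error variances (all names and the test data are illustrative, and σ² is held fixed at zero, as the statistic requires):

```python
import numpy as np
from scipy.optimize import minimize

def chi2_exy(theta, x, y, sig_x2, sig_y2, sig_int2=0.0):
    """Effective chi-square of Press et al.; sig_int2 > 0 gives the
    Tremaine et al. version with intrinsic scatter."""
    alpha, beta = theta
    return np.sum((y - alpha - beta * x) ** 2
                  / (sig_int2 + sig_y2 + beta**2 * sig_x2))

# Simulated data with known truth (alpha, beta) = (1, 2), no intrinsic scatter
rng = np.random.default_rng(3)
n = 2000
xi = rng.normal(0.0, 1.0, n)
sig_x2 = np.full(n, 0.2)
sig_y2 = np.full(n, 0.2)
x = xi + rng.normal(0.0, np.sqrt(sig_x2))
y = 1.0 + 2.0 * xi + rng.normal(0.0, np.sqrt(sig_y2))

fit = minimize(chi2_exy, x0=[0.0, 1.0], args=(x, y, sig_x2, sig_y2))
alpha_hat, beta_hat = fit.x
print(alpha_hat, beta_hat)
```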

## FITEXY: Advantages vs. Disadvantages

Advantages:

- Simulations suggest that FITEXY is less variable than BCES under certain conditions
- Fairly easy to calculate for a given value of σ²

Disadvantages:

- Can't be minimized simultaneously with respect to α, β, and σ²; an ad hoc procedure is often used
- Statistical properties are poorly understood
- Simulations suggest FITEXY can be biased and more variable than BCES
- Not really a true χ², so Δχ² = 1 cannot be used to find confidence regions
- Only constructed for uncorrelated measurement errors

## Structural Approach

Regard ξ and η as missing data: random variables with some assumed probability distribution. Derive the complete data likelihood:

$$p(x, y, \xi, \eta \mid \theta, \psi) = p(x, y \mid \xi, \eta)\, p(\eta \mid \xi, \theta)\, p(\xi \mid \psi)$$

Integrate over the missing data, ξ and η, to obtain the observed (measured) data likelihood function:

$$p(x, y \mid \theta, \psi) = \iint p(x, y, \xi, \eta \mid \theta, \psi)\, d\xi\, d\eta$$

## Mixture of Normals Model

Model the distribution of ξ as a mixture of K Gaussians, and assume Gaussian intrinsic scatter and Gaussian measurement errors of known variance. The model is expressed hierarchically as:

$$\xi_i \mid \pi, \mu, \tau^2 \sim \sum_{k=1}^{K} \pi_k\, N(\mu_k, \tau_k^2)$$

$$\eta_i \mid \xi_i, \alpha, \beta, \sigma^2 \sim N(\alpha + \beta \xi_i,\; \sigma^2)$$

$$y_i, x_i \mid \eta_i, \xi_i \sim N_2\!\left([\eta_i, \xi_i],\; \Sigma_i\right), \qquad \Sigma_i = \begin{pmatrix} \sigma_{y,i}^2 & \sigma_{xy,i} \\ \sigma_{xy,i} & \sigma_{x,i}^2 \end{pmatrix}$$

$$\psi = (\pi, \mu, \tau^2), \qquad \theta = (\alpha, \beta, \sigma^2)$$

References: Kelly (2007, ApJ, in press), Carroll et al. (1999, Biometrics, 55, 44)

## Observed Data Likelihood

Integrate the complete data likelihood to obtain the observed data likelihood:

$$p(x, y \mid \theta, \psi) = \prod_{i=1}^{n} \iint p(x_i, y_i \mid \xi_i, \eta_i)\, p(\eta_i \mid \xi_i, \theta)\, p(\xi_i \mid \psi)\, d\xi_i\, d\eta_i$$

$$= \prod_{i=1}^{n} \sum_{k=1}^{K} \frac{\pi_k}{2\pi\, |V_{k,i}|^{1/2}} \exp\left\{ -\frac{1}{2} (z_i - \zeta_k)^T V_{k,i}^{-1} (z_i - \zeta_k) \right\}$$

where

$$z_i = (y_i, x_i)^T, \qquad \zeta_k = (\alpha + \beta \mu_k,\; \mu_k)^T$$

$$V_{k,i} = \begin{pmatrix} \beta^2 \tau_k^2 + \sigma^2 + \sigma_{y,i}^2 & \beta \tau_k^2 + \sigma_{xy,i} \\ \beta \tau_k^2 + \sigma_{xy,i} & \tau_k^2 + \sigma_{x,i}^2 \end{pmatrix}$$

This can be used to calculate a maximum-likelihood estimate (MLE) or to perform Bayesian inference. See Kelly (2007) for the generalization to multiple covariates.
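For K = 1, the observed-data likelihood above says z_i = (y_i, x_i) is bivariate normal with mean ζ and covariance V. A quick simulation check of that covariance, with arbitrary made-up parameter values:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400_000
mu, tau2 = 0.5, 1.5                  # Gaussian prior on xi (K = 1)
alpha, beta, sig2 = 1.0, 2.0, 0.3    # regression and intrinsic scatter
sy2, sx2, sxy = 0.4, 0.25, 0.1       # (correlated) measurement errors

# Forward-simulate the hierarchical model
xi = rng.normal(mu, np.sqrt(tau2), n)
eta = alpha + beta * xi + rng.normal(0.0, np.sqrt(sig2), n)
me = rng.multivariate_normal([0.0, 0.0], [[sy2, sxy], [sxy, sx2]], n)
y, x = eta + me[:, 0], xi + me[:, 1]

# Marginal covariance of z = (y, x) predicted by the observed-data likelihood
V = np.array([[beta**2 * tau2 + sig2 + sy2, beta * tau2 + sxy],
              [beta * tau2 + sxy, tau2 + sx2]])
print(np.cov(np.vstack([y, x])))     # should be close to V
print(V)
```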

## What if we assume that ξ has a uniform distribution?

The likelihood for uniform p(ξ) can be obtained in the limit τ → ∞:

$$p(x, y \mid \theta) \propto \prod_{i=1}^{n} \left(\sigma^2 + \sigma_{y,i}^2 + \beta^2 \sigma_{x,i}^2\right)^{-1/2} \exp\left\{ -\frac{1}{2}\, \frac{(y_i - \alpha - \beta x_i)^2}{\sigma^2 + \sigma_{y,i}^2 + \beta^2 \sigma_{x,i}^2} \right\}$$

The argument of the exponential is the χ²_EXY statistic! But minimizing χ²_EXY is not the same as maximizing this likelihood. For homoskedastic measurement errors, maximizing this likelihood leads to the ordinary least-squares estimate, so it is still biased.
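The difference between the two objectives is the β-dependent normalization term: the negative log-likelihood is ½ χ²_EXY plus ½ Σ log(σ² + σ²_y,i + β² σ²_x,i), so their minima generally differ. A tiny numerical illustration (the data values are arbitrary):

```python
import numpy as np

def neg_loglik_uniform_xi(theta, x, y, sig_x2, sig_y2, sig_int2):
    """-log likelihood for uniform p(xi), up to a constant: half the
    chi2_EXY statistic plus a beta-dependent normalization term."""
    alpha, beta = theta
    var = sig_int2 + sig_y2 + beta**2 * sig_x2
    chi2 = np.sum((y - alpha - beta * x) ** 2 / var)
    return 0.5 * chi2 + 0.5 * np.sum(np.log(var))

# Arbitrary toy data: the two objectives differ by the log-variance term,
# which depends on beta, so minimizing chi2_EXY is not maximum likelihood.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.1, 1.9, 3.2])
sx2 = np.full(4, 0.25)
sy2 = np.full(4, 0.25)
for b in (0.5, 1.0, 1.5):
    v = sy2 + b**2 * sx2
    half_chi2 = 0.5 * np.sum((y - b * x) ** 2 / v)
    print(b, half_chi2, neg_loglik_uniform_xi((0.0, b), x, y, sx2, sy2, 0.0))
```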

## Structural Approach: Advantages vs. Disadvantages

Advantages:

- Simulations suggest the MLE for the normal mixture is approximately unbiased, even for small samples and large measurement errors
- Lower variance than BCES and FITEXY
- The MLE for σ² is rarely, if ever, equal to zero
- Bayesian inference can be performed, valid for any sample size
- Can be extended to handle non-detections, truncation, and multiple covariates
- Simultaneously estimates the distribution of the covariates, which may be useful for some studies

Disadvantages:

- Computationally intensive, more complicated than BCES and FITEXY
- Assumes the measurement error variance is known and does not depend on the measured value
- Needs a parametric form for the intrinsic distribution of the covariates (but the mixture model is flexible and fairly robust)

## Simulation Study: Slope

Dashed lines mark the median value of the estimator; solid lines mark the true value of the slope. Each simulated data set had 50 data points and y-measurement errors of σ_y ~ σ.

## Simulation Study: Intrinsic Dispersion

Dashed lines mark the median value of the estimator; solid lines mark the true value of σ. Each simulated data set had 50 data points and y-measurement errors of σ_y ~ σ.

## Effects of Sample Selection

Suppose we select a sample of n sources out of a total of N possible sources. Introduce an indicator variable, I, denoting whether a source is included: I_i = 1 if the i-th source is included, otherwise I_i = 0. The selection function for the i-th source is p(I_i = 1 | y_i, x_i); it gives the probability that the i-th source is included in the sample, given the values of x_i and y_i. See Little & Rubin (2002, Statistical Analysis with Missing Data) or Gelman et al. (2004, Bayesian Data Analysis) for further reading.

## Likelihood Under Sample Selection

The complete data likelihood is

$$p(x, y, \xi, \eta, I \mid \theta, \psi, N) \propto \binom{N}{n} \prod_{i \in A_{\rm obs}} p(x_i, y_i, \xi_i, \eta_i, I_i \mid \theta, \psi) \prod_{j \in A_{\rm mis}} p(x_j, y_j, \xi_j, \eta_j, I_j \mid \theta, \psi)$$

Here, A_obs denotes the set of n included sources, and A_mis denotes the set of N − n sources not included. The binomial coefficient gives the number of possible ways to select a subset of n sources from a sample of N sources. Integrating over the missing data, the observed data likelihood becomes

$$p(x_{\rm obs}, y_{\rm obs}, I \mid \theta, \psi, N) \propto \binom{N}{n} \prod_{i \in A_{\rm obs}} p(x_i, y_i \mid \theta, \psi) \prod_{j \in A_{\rm mis}} \iint p(I_j = 0 \mid x_j, y_j)\, p(x_j, y_j \mid \theta, \psi)\, dx_j\, dy_j$$

## Selection Only Dependent on the Covariates

If the sample selection only depends on the covariates, then p(I | x, y) = p(I | x). The observed data likelihood is then

$$p(x_{\rm obs}, y_{\rm obs}, I \mid \theta, \psi, N) \propto \binom{N}{n} \prod_{i \in A_{\rm obs}} p(x_i, y_i \mid \theta, \psi) \prod_{j \in A_{\rm mis}} \iint p(I_j = 0 \mid x_j)\, p(y_j \mid x_j, \theta)\, p(x_j \mid \psi)\, dx_j\, dy_j$$

$$\propto \binom{N}{n} \prod_{i \in A_{\rm obs}} p(x_i, y_i \mid I_i = 1, \theta, \psi_{\rm obs}) \prod_{j \in A_{\rm mis}} \int p(x_j \mid I_j = 0, \psi_{\rm mis})\, p(I_j = 0 \mid \psi_{\rm mis})\, dx_j$$

Therefore, if selection is only on the covariates, then inference on the regression parameters is unaffected by selection effects.
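The contrast between covariate selection and response selection can be demonstrated directly. A sketch with a hypothetical hard cut at zero and no measurement error, so that OLS recovers the true slope on the full sample:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200_000
x = rng.normal(0.0, 1.0, N)
y = 2.0 * x + rng.normal(0.0, 1.0, N)   # true slope = 2

def slope(x, y):
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

keep_x = x > 0.0     # selection on the covariate only
keep_y = y > 0.0     # selection on the response (truncation)

s_cov = slope(x[keep_x], y[keep_x])     # approximately 2: unaffected
s_resp = slope(x[keep_y], y[keep_y])    # noticeably biased toward zero
print(s_cov, s_resp)
```

The covariate cut leaves p(y | x) intact, so the fitted slope is unchanged; the response cut alters p(y | x) and attenuates the slope.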

## Selection Depends on the Dependent Variable: Truncation

Take a Bayesian approach; the posterior is

$$p(\theta, \psi, N \mid x_{\rm obs}, y_{\rm obs}, I) \propto p(\theta, \psi, N)\, p(x_{\rm obs}, y_{\rm obs}, I \mid \theta, \psi, N)$$

Assume a uniform prior on log N, so p(θ, ψ, N) ∝ N⁻¹ p(θ, ψ). Since we don't care about N, sum the posterior over n ≤ N < ∞:

$$p(\theta, \psi \mid x_{\rm obs}, y_{\rm obs}, I) \propto \left[ p(I = 1 \mid \theta, \psi) \right]^{-n} \prod_{i \in A_{\rm obs}} p(x_i, y_i \mid \theta, \psi)$$

$$p(I = 1 \mid \theta, \psi) = \iint p(I = 1 \mid x, y)\, p(x, y \mid \theta, \psi)\, dx\, dy$$

## Covariate Selection vs. Response Selection

- Covariate selection: no effect on the distribution of y at a given x
- Response selection: changes the distribution of y at a given x

## Non-detections: Censored Data

Introduce an additional indicator variable, D, denoting whether a data point is detected: D = 1 if detected. Assuming the selection function is independent of the response, the observed data likelihood becomes

$$p(x, y, D \mid \theta, \psi) \propto \prod_{i \in A_{\rm det}} p(x_i, y_i \mid \theta, \psi) \prod_{j \in A_{\rm cens}} p(x_j \mid \psi) \int p(D_j = 0 \mid y_j, x_j)\, p(y_j \mid x_j, \theta, \psi)\, dy_j$$

A_det denotes the set of detected sources, and A_cens denotes the set of censored sources. Equations for p(x | ψ) and p(y | x, θ, ψ) under the mixture of normals model are given in Kelly (2007).

## Bayesian Inference

Calculate the posterior probability density for the mixture of normals structural model. The posterior is valid for any sample size and doesn't rely on large-sample approximations (e.g., the asymptotic distribution of the MLE). Assume the prior advocated by Carroll et al. (1999), and use Markov chain Monte Carlo (MCMC) to obtain draws from the posterior.

## Gibbs Sampler

A method for obtaining draws from the posterior that is easy to program. Start with initial guesses for all unknowns, then proceed in three steps:

1. Simulate new values of the missing data, including non-detections, given the observed data and the current values of the parameters.
2. Simulate new values of the parameters, given the current values of the prior parameters and the missing data.
3. Simulate new values of the prior parameters, given the current parameter values.

Save the random draws at each iteration and repeat until convergence. Treat the values from the latter part of the Gibbs sampler as random draws from the posterior. Details are given in Kelly (2007). Computer routines are available at the IDL Astronomy User's Library.

## Example: Simulated Data Set with Non-detections

Solid line: true regression line. Dash-dotted line: posterior median. Shaded region: contains approximately 95% of the posterior probability. Filled squares: detected data points. Hollow squares: undetected data points.

## Posterior from MCMC for Simulated Data with Non-detections

The solid vertical line marks the true value. For comparison, a naive MLE that ignores measurement error found β̂_MLE = … ± …, biased toward zero at a level of 3.5σ.

## Example: Dependence of Quasar X-ray Spectral Slope on Eddington Ratio

The solid line is the posterior median; the shaded region contains 95% of the posterior probability.

## Posterior for Quasar Spectral Slope vs. Eddington Ratio

For comparison:

$$\hat{\beta}_{\rm OLS} = 0.56 \pm 0.14, \qquad \hat{\beta}_{\rm BCES} = 3.29 \pm 3.34, \qquad \hat{\beta}_{\rm EXY} = 1.76 \pm 0.49$$

$$\hat{\sigma}_{\rm OLS} = 0.41, \qquad \hat{\sigma}_{\rm BCES} = 0.32, \qquad \hat{\sigma}_{\rm EXY} = 0.0$$

## References

- Akritas & Bershady, 1996, ApJ, 470, 706
- Carroll, Roeder, & Wasserman, 1999, Biometrics, 55, 44
- Carroll, Ruppert, & Stefanski, 1995, Measurement Error in Nonlinear Models (London: Chapman & Hall)
- Fuller, 1987, Measurement Error Models (New York: John Wiley & Sons)
- Gelman et al., 2004, Bayesian Data Analysis (2nd ed.; Boca Raton: Chapman & Hall/CRC)
- Isobe, Feigelson, & Nelson, 1986, ApJ, 306, 490
- Kelly, B. C., 2007, ApJ, in press
- Little & Rubin, 2002, Statistical Analysis with Missing Data (2nd ed.; Hoboken: John Wiley & Sons)
- Press et al., 1992, Numerical Recipes (2nd ed.; Cambridge: Cambridge Univ. Press)
- Schafer, 1987, Biometrika, 74, 385
- Schafer, 2001, Biometrics, 57, 53


### Chapter 9. Non-Parametric Density Function Estimation

9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

### PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 13: SEQUENTIAL DATA

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 13: SEQUENTIAL DATA Contents in latter part Linear Dynamical Systems What is different from HMM? Kalman filter Its strength and limitation Particle Filter

### Scott Gaudi The Ohio State University Sagan Summer Workshop August 9, 2017

Single Lens Lightcurve Modeling Scott Gaudi The Ohio State University Sagan Summer Workshop August 9, 2017 Outline. Basic Equations. Limits. Derivatives. Degeneracies. Fitting. Linear Fits. Multiple Observatories.

### Modeling Data with Linear Combinations of Basis Functions. Read Chapter 3 in the text by Bishop

Modeling Data with Linear Combinations of Basis Functions Read Chapter 3 in the text by Bishop A Type of Supervised Learning Problem We want to model data (x 1, t 1 ),..., (x N, t N ), where x i is a vector

### ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

### Markov Chain Monte Carlo in Practice

Markov Chain Monte Carlo in Practice Edited by W.R. Gilks Medical Research Council Biostatistics Unit Cambridge UK S. Richardson French National Institute for Health and Medical Research Vilejuif France

### Chapter 9. Non-Parametric Density Function Estimation

9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

### Nonparametric Bayesian Methods - Lecture I

Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics

### DAG models and Markov Chain Monte Carlo methods a short overview

DAG models and Markov Chain Monte Carlo methods a short overview Søren Højsgaard Institute of Genetics and Biotechnology University of Aarhus August 18, 2008 Printed: August 18, 2008 File: DAGMC-Lecture.tex

### Department of Large Animal Sciences. Outline. Slide 2. Department of Large Animal Sciences. Slide 4. Department of Large Animal Sciences

Outline Advanced topics from statistics Anders Ringgaard Kristensen Covariance and correlation Random vectors and multivariate distributions The multinomial distribution The multivariate normal distribution

### Probabilistic machine learning group, Aalto University Bayesian theory and methods, approximative integration, model

Aki Vehtari, Aalto University, Finland Probabilistic machine learning group, Aalto University http://research.cs.aalto.fi/pml/ Bayesian theory and methods, approximative integration, model assessment and

### Estimation of Operational Risk Capital Charge under Parameter Uncertainty

Estimation of Operational Risk Capital Charge under Parameter Uncertainty Pavel V. Shevchenko Principal Research Scientist, CSIRO Mathematical and Information Sciences, Sydney, Locked Bag 17, North Ryde,

### Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond January 18, 2017 Contents 1 Batch and Recursive Estimation 2 Towards Bayesian Filtering 3 Kalman Filter and Bayesian Filtering and Smoothing

### Petr Volf. Model for Difference of Two Series of Poisson-like Count Data

Petr Volf Institute of Information Theory and Automation Academy of Sciences of the Czech Republic Pod vodárenskou věží 4, 182 8 Praha 8 e-mail: volf@utia.cas.cz Model for Difference of Two Series of Poisson-like

### Summary and discussion of The central role of the propensity score in observational studies for causal effects

Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background

### Bayesian model selection in graphs by using BDgraph package

Bayesian model selection in graphs by using BDgraph package A. Mohammadi and E. Wit March 26, 2013 MOTIVATION Flow cytometry data with 11 proteins from Sachs et al. (2005) RESULT FOR CELL SIGNALING DATA

### Variational Principal Components

Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

### PILCO: A Model-Based and Data-Efficient Approach to Policy Search

PILCO: A Model-Based and Data-Efficient Approach to Policy Search (M.P. Deisenroth and C.E. Rasmussen) CSC2541 November 4, 2016 PILCO Graphical Model PILCO Probabilistic Inference for Learning COntrol

### Association studies and regression

Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

### The regression model with one stochastic regressor.

The regression model with one stochastic regressor. 3150/4150 Lecture 6 Ragnar Nymoen 30 January 2012 We are now on Lecture topic 4 The main goal in this lecture is to extend the results of the regression

### Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for