A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics, Spring 2015


Lecture 9
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics, Spring 2015
http://www.astro.cornell.edu/~cordes/a6523

Applications:
- Comparison of frequentist and Bayesian inference
- Introduction to data mining in large-scale surveys

Reading: Gregory, Chapters 6 and 7; Hancock article

Lecture 10 (Thursday 26 Feb): Adam Brazier (Cornell Center for Advanced Computing) will talk about astronomy-survey workflows and the how-to of databases.

Topics for Lecture 10 this week

Sensor data (e.g., telescope data) often require further filtering and cross-comparisons of the global output. By storing output in a database we can query our data products efficiently and with a wide variety of qualifiers and filters. Databases, particularly relational databases, are used in many fields, including industry, to store information in a form that can be queried efficiently. We will introduce the relational database structure: how such databases can be queried, how they should be designed, and how they can be incorporated into the scientific workflow.
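As a concrete illustration, here is a minimal sketch of such a workflow using Python's built-in sqlite3 module. The candidates table, its columns, and the threshold values are hypothetical stand-ins for real survey data products.

```python
# Minimal relational-database sketch: store survey candidates, then query
# with qualifiers and filters. Schema and values are made up for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()
cur.execute("""
    CREATE TABLE candidates (
        id  INTEGER PRIMARY KEY,
        ra  REAL,   -- right ascension (deg)
        dec REAL,   -- declination (deg)
        dm  REAL,   -- dispersion measure (pc cm^-3)
        snr REAL    -- detection signal-to-noise ratio
    )""")
cur.executemany(
    "INSERT INTO candidates (ra, dec, dm, snr) VALUES (?, ?, ?, ?)",
    [(123.4, -12.3, 56.8, 9.2),
     (123.5, -12.3, 57.1, 14.7),
     (200.1, 5.0, 12.2, 6.1)],
)
# Query with filters: bright candidates in a DM range, ordered by significance.
for row in cur.execute(
        "SELECT id, ra, dec, snr FROM candidates "
        "WHERE snr > 8 AND dm BETWEEN 50 AND 60 ORDER BY snr DESC"):
    print(row)
conn.close()
```

The same SELECT-with-filters pattern scales from this toy table to the multi-million-row candidate tables produced by real surveys; only the storage engine changes.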

Topics Plan

- Bayesian inference
- Detection problems
  - Matched filtering and localization
- Modeling (linear, nonlinear)
  - Cost functions
  - Parameter estimation and errors
- Optimization methods
  - Hill climbing, annealing, genetic algorithms
  - MCMC variants (Gibbs, Hamiltonian)
- Generalized spectral analysis
  - Lomb-Scargle
  - Maximum entropy
  - High-resolution methods
  - Bayesian approaches
  - Wavelets
- Principal components
  - Cholesky decomposition
- Large-scale surveys in astronomy
  - Time domain
  - Spectral line
  - Images and image cubes
- Detection & characterization of events, sources, objects
  - Known object types
  - Unknown object types
  - Current algorithms
- Data mining tools
  - Databases
  - Distributed processing

Some Matrix Manipulations

We will consider matrix manipulations in our treatment of model fitting, optimization, and basis vectors.

Dot product: we can write this in several ways:
$$c = \mathbf{a} \cdot \mathbf{b} = \sum_j a_j b_j = a_j b_j \quad \text{(summation convention for repeated indices)}$$

Derivative:
$$\nabla_{\mathbf{a}}\, c = \left(\hat{a}_i \frac{\partial}{\partial a_i}\right) c = \left(\hat{a}_i \frac{\partial}{\partial a_i}\right) a_j b_j = \hat{a}_i b_i = \mathbf{b}$$

Transformation:
$$\mathbf{y} = \mathsf{A}\mathbf{x} = \mathrm{col}_i \Big(\sum_j A_{ij} x_j\Big) = \mathrm{col}(A_{ij} x_j), \qquad \nabla_{\mathbf{x}}\, \mathbf{y} = \hat{x}_k \frac{\partial}{\partial x_k} A_{ij} x_j = (\hat{x}_k A_{ik}) = \mathsf{A}$$

Quadratic form:
$$Q = \mathbf{x}^{t} \mathsf{A} \mathbf{x} = \text{scalar}, \qquad \nabla_{\mathbf{x}}\, Q = \hat{x}_k \frac{\partial}{\partial x_k} A_{ij} x_i x_j = \hat{x}_k \left(A_{kj} x_j + A_{ik} x_i\right) = \mathsf{A}\mathbf{x} + \mathsf{A}^{t}\mathbf{x} = (\mathsf{A} + \mathsf{A}^{t})\,\mathbf{x}$$
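These identities are easy to verify numerically with central finite differences; a quick sanity check, with arbitrary test values:

```python
# Numerical sanity check of the gradient identities above (real case).
import numpy as np

rng = np.random.default_rng(0)
N = 4
A = rng.standard_normal((N, N))   # arbitrary test matrix
x = rng.standard_normal(N)        # arbitrary test vector
bvec = rng.standard_normal(N)
eps = 1e-6

def grad_fd(f, x):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for k in range(len(x)):
        dx = np.zeros_like(x)
        dx[k] = eps
        g[k] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return g

# grad_a (a . b) should equal b
print(np.allclose(grad_fd(lambda a: a @ bvec, x), bvec, atol=1e-5))             # True
# grad_x (x^t A x) should equal (A + A^t) x
print(np.allclose(grad_fd(lambda v: v @ A @ v, x), (A + A.T) @ x, atol=1e-5))   # True
```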

Consider vectors $\mathbf{A}$, $\mathbf{B}$ and matrix $\mathsf{C}$ with sizes $N \times 1$, $N \times 1$, and $N \times N$, respectively. Show that

(a) $\nabla_{\mathbf{A}}\,(\mathbf{A} \cdot \mathbf{B}) = \mathbf{B}$
(b) $\nabla_{\mathbf{A}}\,|\mathbf{A}|^2 = 2\mathbf{A}$
(c) $\nabla_{\mathbf{A}}\,(\mathbf{A}^{t}\mathsf{C}\mathbf{A}) = (\mathsf{C} + \mathsf{C}^{t})\mathbf{A}$ for real $\mathbf{A}$.
(d) $\nabla_{\mathbf{A}}\,(\mathbf{A}^{\dagger}\mathsf{C}\mathbf{A}) = \mathsf{C}^{t}\mathbf{A}^{*} + \mathsf{C}\mathbf{A}$ for complex $\mathbf{A}$.
(e) $(\mathsf{C}\mathbf{A})^{\dagger} = \mathbf{A}^{\dagger}\mathsf{C}^{\dagger}$
(f) If $\mathbf{A}$ is a zero-mean stochastic process (e.g., a vector of $N$ measurements of a noiselike signal), its covariance matrix can be written as $\mathsf{C} = \langle \mathbf{A}\mathbf{A}^{\dagger} \rangle$.

Here the notation is: $*$ = conjugate; $\dagger$ = transpose conjugate; $t$ = transpose.
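For part (f), a hedged numerical sketch of estimating the covariance matrix by averaging $\mathbf{A}\mathbf{A}^{\dagger}$ over many realizations; the diagonal "true" covariance and the number of realizations are made-up illustration values:

```python
# Estimate C = <A A^t> of a zero-mean real Gaussian process from realizations.
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 100_000                        # vector length, number of realizations
C_true = np.diag(np.arange(1.0, N + 1))  # assumed true covariance (diagonal)
# Each row of A is one zero-mean realization of the process.
A = rng.multivariate_normal(np.zeros(N), C_true, size=M)
C_est = A.T @ A / M                      # sample average of A A^t over realizations
print(np.allclose(C_est, C_true, atol=0.2))   # close to C_true for large M
```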

Inference Questions

- Finding objects (detection): in $t$, $\nu$, $\lambda$, $\theta$; $(\theta, t, \lambda)$, $(\theta, t, \lambda, \pi)$, etc. ($\pi$ = polarization)
  - Blind surveys (no prior info)
  - We typically know a lot about object types and shapes, plus instrumental resolutions (PSF)
- Characterizing individual objects: parametric models (estimation)
- Modeling object populations: distributions of objects of the same class
- Comparison of populations (cross-catalog)
  - Supernova remnants and NS, BH
  - GRBs and galaxies (partial correlation)

Example targets and observables: stars, galaxies, asteroids, gas clouds; M, T, R, V, line ratios, spin periods, orbits, tests of physics; HR diagram, SMBH-galaxy halo, exoplanet distributions, prevalence of Earth-like planets.

Examples of Detections

- Change points
  - Change in mean
  - Change in variance
  - Change in slope
  - Comparison of models with and without change points
- Finding clusters in measurement space or parameter space (one type of object detection)
  - Clustering algorithms
  - Bayesian blocks
- Object shape known: matched filtering (see the sketch after this list)
  - Correlation-function based
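For the "object shape known" case, a minimal matched-filtering sketch: correlate the data against the known template and read off the peak in signal-to-noise units. The Gaussian pulse shape, injection amplitude, and noise level below are illustrative assumptions.

```python
# Matched filter: correlate noisy data with a known pulse template.
import numpy as np

rng = np.random.default_rng(2)
n = 512
t = np.arange(-25, 26)
template = np.exp(-0.5 * (t / 5.0) ** 2)       # known pulse shape (Gaussian, width 5)
data = rng.standard_normal(n)                  # unit-variance white noise
data[200:200 + t.size] += 3.0 * template       # inject a pulse starting at sample 200
# Normalize so that, for unit-variance white noise, the output is in S/N units.
mf = np.correlate(data, template, mode="same") / np.sqrt(np.sum(template ** 2))
print(f"peak S/N {mf.max():.1f} near sample {mf.argmax()}")
```

The peak lands near the center of the injected pulse; thresholding the filter output at, say, 6-8 sigma turns this into a detection statistic.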

Bayesian Model Comparison

Bayesian parameter estimation:
$$P(\theta \mid D, M) = \frac{P(\theta \mid M)\, P(D \mid \theta, M)}{P(D \mid M)},$$
where the normalization is
$$P(D \mid M) = \int d\theta\, P(\theta \mid M)\, P(D \mid \theta, M).$$

We can rewrite this in terms of the likelihood $\mathcal{L}(\theta \mid D, M) = P(D \mid \theta, M)$ as
$$P(\theta \mid D, M) = \frac{P(\theta \mid M)\, \mathcal{L}(\theta \mid D, M)}{\int d\theta\, P(\theta \mid M)\, \mathcal{L}(\theta \mid D, M)}.$$

Global likelihood:
$$\mathcal{L}(M) \equiv P(D \mid M) = \int d\theta\, P(\theta \mid M)\, \mathcal{L}(\theta \mid D, M).$$

Comparison of alternative models $M_i$, $i = 1, 2, \ldots$: extrapolating from parameter space to model space, the global likelihood of model $M_i$ is $\mathcal{L}(M_i)$.

Prior for model $M_i$: $P(M_i \mid I)$. Posterior for model $M_i$:
$$P(M_i \mid D, I) = \frac{P(M_i \mid I)\, \mathcal{L}(M_i)}{P(D \mid I)}.$$
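For a concrete feel, here is a hedged sketch that evaluates the global likelihood by direct quadrature for a one-parameter model (unknown Gaussian mean, known noise). The data, noise level, and flat prior range are illustrative assumptions, not from the lecture.

```python
# Global likelihood L(M) = integral of P(theta|M) * L(theta|D,M) over theta,
# computed on a grid for a Gaussian-mean model with known sigma.
import numpy as np

rng = np.random.default_rng(3)
sigma = 1.0
D = 0.5 + sigma * rng.standard_normal(20)      # data: true mean 0.5

theta = np.linspace(-5.0, 5.0, 2001)           # grid over flat prior [-5, 5]
dtheta = theta[1] - theta[0]
prior = 1.0 / 10.0                             # P(theta|M) = 1 / Delta_theta
# Gaussian log-likelihood of all the data for each grid value of theta
loglike = np.array([-0.5 * np.sum((D - t) ** 2) / sigma**2
                    - 0.5 * D.size * np.log(2 * np.pi * sigma**2) for t in theta])
global_L = np.sum(prior * np.exp(loglike)) * dtheta   # L(M) = P(D|M)
print("global likelihood P(D|M) =", global_L)
```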

Odds ratio (an implementation of Occam's razor):
$$O_{ij} = \frac{P(M_i \mid D, I)}{P(M_j \mid D, I)} = \frac{P(M_i \mid I)}{P(M_j \mid I)} \times \frac{\mathcal{L}(M_i)}{\mathcal{L}(M_j)} = (\text{ratio of priors}) \times (\text{Bayes factor } B_{ij}).$$

For the Bayes factor we need the global likelihood for each model:
$$\mathcal{L}(M_i) = \int d\theta\, P(\theta \mid M_i)\, \mathcal{L}(\theta \mid D, M_i).$$

1D case: suppose the prior is flat and uninformative with width $\Delta\theta$, and the likelihood function $\mathcal{L}(\theta \mid D, M_i)$ is unimodal with width $\delta\theta < \Delta\theta$. Then we can approximate the global likelihood for $M_i$ as
$$\mathcal{L}(M_i) = \int d\theta\, P(\theta \mid M_i)\, \mathcal{L}(\theta \mid D, M_i) \approx \mathcal{L}(\theta \mid D, M_i)\big|_{\max}\, \frac{\delta\theta}{\Delta\theta} = \mathcal{L}(\hat{\theta} \mid D, M_i)\, \frac{\delta\theta}{\Delta\theta}.$$

The Bayes factor becomes
$$B_{ij} = \frac{\mathcal{L}(M_i)}{\mathcal{L}(M_j)} \approx \frac{\mathcal{L}(\hat{\theta}_i \mid D, M_i)}{\mathcal{L}(\hat{\theta}_j \mid D, M_j)} \times \frac{\delta\theta_i\, \Delta\theta_j}{\delta\theta_j\, \Delta\theta_i}.$$

There is a tradeoff between the amplitudes of the maximum likelihoods, the widths of the likelihood functions, and the widths of the priors. The Bayes factor penalizes models that require larger volumes of parameter space to be searched.
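The $\delta\theta/\Delta\theta$ approximation is easy to check numerically. For a Gaussian-shaped likelihood it holds up to an order-unity factor ($\sqrt{2\pi}$ when $\delta\theta$ is identified with the Gaussian $\sigma$); the peak value and widths below are arbitrary.

```python
# Check L(M) ~ L_max * (delta_theta / Delta_theta) for a flat prior and a
# Gaussian-shaped unimodal likelihood. All values are arbitrary illustrations.
import numpy as np

L_max, dtheta_w, Dtheta = 1.0, 0.3, 10.0       # peak value, likelihood width, prior width
theta = np.linspace(-Dtheta / 2, Dtheta / 2, 100_001)
step = theta[1] - theta[0]
like = L_max * np.exp(-0.5 * (theta / dtheta_w) ** 2)   # unimodal likelihood
exact = np.sum(like / Dtheta) * step           # exact integral with flat prior 1/Dtheta
approx = L_max * dtheta_w / Dtheta             # L_max * delta_theta / Delta_theta
print(exact / approx)                          # ~ sqrt(2*pi) ~ 2.51 for this width convention
```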

Figure 1: A noninformative prior for the mean, $\mu$. In this case, a flat prior PDF, $f_\mu(\mu)$, is shown along with a likelihood function, $\mathcal{L}(\mu)$, that is much narrower than the prior. The peak of $\mathcal{L}$ occurs at the maximum likelihood estimate for $\mu$, which is the arithmetic mean of the data.

F & B Approaches to a Simple Model

Frequentist (F) and Bayesian (B) approaches to comparing two simple models:

M1 (constant): $y = a$
M2 (line): $y = a + bx$
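A hedged numerical sketch of the Bayesian side of this comparison: simulate data from a line, evaluate each model's global likelihood on a grid with flat priors, and form the Bayes factor $B_{21}$. The simulated data, noise level, and prior ranges are all illustrative assumptions.

```python
# Bayes factor for M1 (y = a) vs. M2 (y = a + b*x) by grid integration.
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 30)
sigma = 0.5
y = 1.0 + 2.0 * x + sigma * rng.standard_normal(x.size)   # data drawn from a line

def loglike(model):
    """Gaussian log-likelihood of the data y given model values."""
    return (-0.5 * np.sum((y - model) ** 2) / sigma**2
            - 0.5 * y.size * np.log(2 * np.pi * sigma**2))

a = np.linspace(-5.0, 5.0, 401)                # flat priors of width 10 on a and b
b = np.linspace(-5.0, 5.0, 401)
da, db = a[1] - a[0], b[1] - b[0]

# M1: global likelihood L(M1) = integral over a of P(a) * L(a)
L1 = np.sum(np.exp([loglike(ai) for ai in a]) / 10.0) * da

# M2: two-dimensional grid integration over (a, b), joint flat prior 1/100
grid = np.array([[loglike(ai + bi * x) for bi in b] for ai in a])
L2 = np.sum(np.exp(grid) / 100.0) * da * db

print("Bayes factor B_21 = %.3g" % (L2 / L1))  # >> 1 favors the line model here
```

With data actually drawn from a line, $B_{21} \gg 1$; rerunning with $b = 0$ in the simulated data typically pulls $B_{21}$ below 1, the Occam penalty for M2's extra parameter.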