Large Scale Bayesian Inference


Large Scale Bayesian Inference in Cosmology Jens Jasche Garching, 11 September 2012

Introduction Cosmography 3D density and velocity fields Power-spectra, bi-spectra Dark Energy, Dark Matter, Gravity Cosmological parameters Large scale Bayesian inference High dimensional ( ~ 10^7 parameters ) State-of-the-art technology On the verge of numerical feasibility

Introduction Why do we need Bayesian inference? Inference of signals = ill-posed problem Noise Incomplete observations Systematics No unique recovery possible!

Introduction What are the possible signals compatible with observations? Object of interest: the signal posterior distribution We can do science! Model comparison Parameter studies Report statistical summaries Non-linear, non-Gaussian error propagation

Markov Chain Monte Carlo Problems: High dimensional (~10^7 parameters) A large number of correlated parameters No reduction of problem size possible Complex posterior distributions Numerical approximation required For dim > 4: MCMC, Metropolis-Hastings
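The Metropolis-Hastings machinery mentioned on the slide can be sketched in a few lines. This is a generic random-walk sampler, not the code used in the talk; the toy standard-normal target at the bottom is an assumption for illustration.

```python
import numpy as np

def metropolis_hastings(log_post, x0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.
    log_post: callable returning the log of the (unnormalized) posterior."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for i in range(n_steps):
        prop = x + step * rng.standard_normal(x.size)  # symmetric proposal
        lp_prop = log_post(prop)
        # accept with probability min(1, posterior ratio)
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        samples[i] = x
    return samples, accepted / n_steps

# Toy target (an assumption for illustration): standard normal in 2D.
samples, rate = metropolis_hastings(lambda x: -0.5 * float(x @ x), np.zeros(2), 5000)
```

In ~10^7 dimensions such an isotropic random walk is hopeless, which is exactly why the talk moves on to Hamiltonian sampling.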

Hamiltonian sampling Parameter space exploration via Hamiltonian sampling: interpret the log-posterior as a potential and introduce a Gaussian auxiliary momentum variable. The resultant joint posterior distribution is separable in parameters and momenta, and marginalization over the momenta yields the target posterior again.

Hamiltonian sampling IDEA: Use Hamiltonian dynamics to explore the parameter space: solve the Hamiltonian system to obtain a new sample. Hamiltonian dynamics conserve the Hamiltonian, so the Metropolis acceptance probability is unity and all samples are accepted.
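A minimal sketch of one such transition, assuming a unit mass matrix and the standard leapfrog integrator (the integrator is not named on the slide but is the usual choice): momenta are drawn, Hamilton's equations are integrated, and the residual energy error is corrected by a Metropolis step. In exact arithmetic the energy error vanishes and every sample is accepted; the toy target below is an assumption for illustration.

```python
import numpy as np

def hmc_sample(log_post, grad, x0, n_steps, eps=0.15, n_leap=20, seed=0):
    """Hamiltonian Monte Carlo with a unit mass matrix (an assumption).
    grad: gradient of log_post, needed by the leapfrog integrator."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for i in range(n_steps):
        p0 = rng.standard_normal(x.size)           # auxiliary momenta ~ N(0, 1)
        q, p = x.copy(), p0 + 0.5 * eps * grad(x)  # opening half momentum step
        for _ in range(n_leap - 1):
            q += eps * p                           # full position step
            p += eps * grad(q)                     # full momentum step
        q += eps * p
        p += 0.5 * eps * grad(q)                   # closing half momentum step
        # H = -log_post + p^2/2; exact dynamics conserve H, so dH -> 0 and
        # the Metropolis acceptance probability approaches unity.
        dH = (log_post(q) - 0.5 * p @ p) - (log_post(x) - 0.5 * p0 @ p0)
        if np.log(rng.random()) < dH:
            x = q
        samples[i] = x
    return samples

# Toy target (assumption): standard normal, so grad log_post(x) = -x.
chain = hmc_sample(lambda x: -0.5 * float(x @ x), lambda x: -x, np.zeros(2), 2000)
```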

Hamiltonian sampling Example: Wiener posterior = multivariate normal distribution (Gaussian prior, Gaussian likelihood) EOM: coupled harmonic oscillator

Hamiltonian sampling How to set the mass matrix? A large number of tunable parameters determines the efficiency of the sampler. The mass matrix aims at decoupling the system. In practice: use a diagonal approximation; the quality of this approximation determines the sampler efficiency. Non-Gaussian case: Taylor-expand the potential to find the mass matrix.
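The decoupling role of the mass matrix can be seen in a two-dimensional toy problem (my construction, not from the slides): for a Gaussian target with very different variances per dimension, choosing the diagonal mass matrix equal to the curvature of the log-posterior gives every dimension the same oscillation frequency under the Hamiltonian dynamics.

```python
import numpy as np

def leapfrog_scaled(grad, q, p, M_diag, eps, n_leap):
    """Leapfrog with a diagonal mass matrix M (the approximation on the
    slide): positions advance with M^{-1} p, so setting M_ii to the
    curvature -d^2 log P / dq_i^2 decouples a Gaussian system."""
    p = p + 0.5 * eps * grad(q)
    for _ in range(n_leap - 1):
        q = q + eps * p / M_diag
        p = p + eps * grad(q)
    q = q + eps * p / M_diag
    p = p + 0.5 * eps * grad(q)
    return q, p

# Toy Gaussian with wildly different scales: variances 1 and 10^4.
var = np.array([1.0, 1.0e4])
grad = lambda q: -q / var          # gradient of log N(0, diag(var))
M = 1.0 / var                      # curvature-matched diagonal mass matrix
# Integrate to t = 1: both dimensions follow q0 * cos(t), same frequency.
q, p = leapfrog_scaled(grad, np.array([1.0, 100.0]), np.zeros(2), M, 0.1, 10)
```

With a unit mass matrix instead, the stiff dimension would force a step size ~100x smaller; the curvature-matched choice removes that bottleneck, which is the efficiency argument made on the slide.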

HMC in action Inference of non-linear density fields in cosmology Non-linear density field: log-normal prior, see e.g. Coles & Jones (1991), Kayo et al. (2001) Galaxy distribution: Poisson likelihood, signal-dependent noise Credit: M. Blanton and the Sloan Digital Sky Survey Problem: non-Gaussian sampling in high dimensions HADES (HAmiltonian Density Estimation and Sampling) Jasche, Kitaura (2010)
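The log-normal-prior / Poisson-likelihood model translates into an HMC potential (negative log-posterior) and its gradient roughly as follows. This is a schematic of the model class, not the HADES code: a diagonal prior precision and a uniform mean galaxy density `nbar` are simplifying assumptions, whereas the real field posterior carries the full covariance of the log-density.

```python
import numpy as np

def potential(s, n_g, nbar, prior_prec):
    """Negative log-posterior for a log-normal prior on the log-density
    field s and a Poisson likelihood for galaxy counts n_g (schematic)."""
    prior = 0.5 * np.sum(prior_prec * s**2)    # Gaussian in log-density
    lam = nbar * np.exp(s)                     # Poisson intensity per cell
    likelihood = np.sum(lam - n_g * s)         # -log Poisson, up to constants
    return prior + likelihood

def grad_potential(s, n_g, nbar, prior_prec):
    """Analytic gradient, as required by the leapfrog integrator."""
    return prior_prec * s + nbar * np.exp(s) - n_g
```

The `nbar * exp(s)` term is where the signal-dependent noise enters: cells with higher density are expected to host more galaxies, so their Poisson scatter is larger.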

LSS inference with the SDSS Application of HADES to SDSS DR7: cubic, equidistant box with side length 750 Mpc, ~3 Mpc grid resolution, ~10^7 volume elements / parameters Goal: provide a representation of the SDSS density posterior, provide 3D cosmographic descriptions, quantify uncertainties of the density distribution Jasche, Kitaura, Li, Enßlin J. Jasche, Bayesian LSS Inference (2010)

Multiple Block Sampling What if HMC is not an option? Problem: design of good proposal distributions, high rejection rates Multiple block sampling (see e.g. Hastings (1970)): break the problem down into subproblems. This simplifies the design of conditional proposal distributions and raises the average acceptance rate, but requires serial processing.
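The higher acceptance rate of block sampling is easiest to see in a toy two-block example (my construction, with two scalar blocks standing in for the field and nuisance-parameter blocks): when each block is drawn exactly from its conditional distribution, every step is accepted.

```python
import numpy as np

def two_block_gibbs(n_steps, rho=0.8, seed=0):
    """Toy two-block sampler for a bivariate Gaussian with correlation rho.
    Each block is drawn from its exact conditional, so there are no
    rejections; the blocks must be updated one after the other (the serial
    processing noted on the slide)."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    out = np.empty((n_steps, 2))
    sd = np.sqrt(1.0 - rho**2)
    for i in range(n_steps):
        x = rho * y + sd * rng.standard_normal()   # block 1: x | y
        y = rho * x + sd * rng.standard_normal()   # block 2: y | x
        out[i] = x, y
    return out

chain = two_block_gibbs(20000)
```

The price is visible in the same example: the stronger the correlation between blocks, the smaller each conditional move, so strongly coupled blocks mix slowly even though nothing is rejected.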

Multiple Block Sampling Can we boost block sampling? Sometimes it is easier to explore the full joint PDF Block sampler: process blocks in parallel! Permits efficient sampling for numerically expensive posteriors

Photometric redshift sampling Photometric surveys: millions of galaxies ( ~10^7-10^8 ), low redshift accuracy ( ~100 Mpc along the LOS ) Infer accurate redshifts: rather, sample from the joint distribution of density field and redshifts Block sampler: process in parallel! HMC sampler!

Photometric redshift sampling Application to artificial photometric data: noise, systematics, position uncertainty (~100 Mpc) ~10^7 density amplitudes / parameters ~2x10^7 radial galaxy positions / parameters ~3x10^7 parameters in total J. Jasche, Bayesian LSS Inference

Photometric redshift sampling Jasche, Wandelt (2012)

Deviation from the truth: before vs. after. Jasche, Wandelt (2012)

Deviation from the truth: raw data vs. density estimate. Jasche, Wandelt (2012)

4D physical inference Physical motivation: complex final state, simple initial state. Gravity evolves the simple initial state into the complex final state.

4D physical inference The ideal scenario: we would need a very, very large computer! Not practical, even with approximations!

4D physical inference BORG (Bayesian Origin Reconstruction from Galaxies): HMC + second-order Lagrangian perturbation theory Jasche, Wandelt (2012)

4D physical inference

Summary & Conclusion Large scale Bayesian inference: inference in high dimensions from incomplete observations (noise, systematic effects, survey geometry, selection effects, biases) Need to quantify uncertainties: explore the posterior distribution Markov Chain Monte Carlo methods Hamiltonian sampling (exploit symmetries, decouple the system) Multiple block sampling (break the problem down into subproblems) Three high-dimensional examples (>10^7 parameters): non-linear density inference, photometric redshift and density inference, 4D physical inference

The End Thank you