Computer intensive statistical methods Lecture 1


Computer intensive statistical methods, Lecture 1. Jonas Wallin (jonwal@chalmers.se), Chalmers, Gothenburg.

People and literature

The following people are involved in the course:

Function   Name           Room   E-mail
Lecturer   Jonas Wallin   2112   jonwal@chalmers.se

The following material will be used:
- Slides, available online (around the day) after each lecture.
- Introducing Monte Carlo Methods with R, by Christian Robert and George Casella.

Course schedule and homepage

The course schedule is as follows:

                      Weekday     Time         Room
Lecture I             Tuesday     13.15-15.00  Pascal
Computer session I    Tuesday     15.15-17.00  MVF522
Office hours          Wednesday   10.15-11.00  L2112 (my office)
Lecture II            Thursday    13.15-15.00  Pascal
Computer session II   Thursday    15.15-17.00  MVF522

During the first week the computer lab will be an introduction to R. Information and R files will be available at the homepage (http://www.math.chalmers.se/stat/grundutb/cth/mve186/1415/).

Examination

The examination comprises three larger projects, handed out during weeks 2, 4, and 6, and a written exam. Each project requires the submission of a report. The projects, which are solved in pairs, concern:
1. Simulation and Monte Carlo integration.
2. Bayesian modeling and inference.
3. Markov chain Monte Carlo methods, together with Bayesian modeling and inference.
The projects give bonus points for the written exam.

Course contents

- Simulation and Monte Carlo integration
- Bayesian modeling and inference
- Markov chain Monte Carlo (MCMC) methods
- Other methods, such as the EM algorithm and INLA (if time permits)

Bayesian statistics

Unlike frequentist statistics (think of a first course in statistics), Bayesian statistics does not consider the parameters fixed, but random.

Bayesian modelling: a Bayesian model consists of
- a prior (a priori) model for the parameters, $\Theta$, given by the probability density $\pi(\theta)$;
- a conditional model for the data, $y$, given reality, with density $f(y \mid \theta)$.
The prior can be expanded into several layers, creating a Bayesian hierarchical model.

Bayes' formula

How should the prior and likelihood be combined to make inference about $\Theta$, given observations of $y$? Bayes' formula:
$$f(\theta \mid y) = \frac{f(y \mid \theta)\,\pi(\theta)}{f(y)} = \frac{f(y \mid \theta)\,\pi(\theta)}{\int_\chi f(y \mid \theta')\,\pi(\theta')\,\mathrm{d}\theta'}.$$
$f(\theta \mid y)$ is called the posterior (a posteriori) distribution. Often, only the proportionality relation
$$f(\theta \mid y) \propto \pi(\theta, y) = f(y \mid \theta)\,\pi(\theta),$$
seen as a function of $\theta$, is needed.
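As a concrete illustration of the formula (a toy example of mine, not from the lecture), consider a Beta(2, 2) prior with a binomial likelihood; the normalizing constant $f(y)$ can be approximated by a Riemann sum over a grid, and the result checked against the known conjugate posterior:

# A minimal sketch of Bayes' formula on a grid (toy example, not course code):
# Beta(2, 2) prior, binomial likelihood with y = 7 successes out of 10 trials.
# The exact posterior is Beta(9, 5), so the grid evaluation can be verified.
theta <- seq(0, 1, length.out = 1001)
prior <- dbeta(theta, 2, 2)                    # pi(theta)
lik   <- dbinom(7, size = 10, prob = theta)    # f(y | theta)
post  <- lik * prior                           # unnormalized posterior
post  <- post / (sum(post) * (theta[2] - theta[1]))  # divide by a Riemann-sum estimate of f(y)
max(abs(post - dbeta(theta, 9, 5)))            # small, so it matches Beta(9, 5)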

Korsbetning

In 1361 the Danish king Valdemar Atterdag conquered Gotland and captured the rich Hanseatic town of Visby. In 1929-1930 the grave site was excavated. A total of 493 femurs (237 right, 256 left) were found. How many people were buried there?

Korsbetning

Using Bayesian inference, we get: [Figure: posterior probability of the number of buried individuals $N$, plotted for $N$ between 500 and 2000.]
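One way such a posterior could be computed is sketched below; the model and every number except the femur counts are assumptions of mine, not necessarily the model used in the lecture. Each of the $N$ individuals' right (and, independently, left) femur is assumed recovered with unknown probability $\phi \sim \mathrm{unif}(0,1)$, with a flat prior on $N$ over a grid; the marginal likelihood in $N$ is estimated by Monte Carlo integration over $\phi$:

# Hedged sketch, not the lecturer's model: flat prior on N, phi ~ Unif(0, 1),
# right and left femur counts conditionally independent Bin(N, phi).
set.seed(1)
y_right <- 237; y_left <- 256
N_grid  <- 260:2500                      # grid for N (must exceed the largest count)
phi     <- runif(1e4)                    # Monte Carlo draws from the prior on phi
post    <- sapply(N_grid, function(N)
  mean(dbinom(y_right, N, phi) * dbinom(y_left, N, phi)))  # marginal likelihood in N
post <- post / sum(post)                 # normalize over the grid
plot(N_grid, post, type = "l", xlab = "N", ylab = "posterior probability")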

Image recovery

Suppose we have a corrupted image; how can we use Bayesian inference to recover it? At what level of corruption can we still recover the image? [Figures: the image corrupted at levels p = 0.1, 0.3, 0.6 and 0.9.]

Change of intensity, Poisson processes

The figure shows the cumulative number of coal mining accidents for the years 1851 to 1963. [Figure: accident number (0 to 150) versus year (1851 to 1963).]

Mixture models, nonparametric Bayesian

[Figure: Old Faithful eruptions; density versus minutes (40 to 100).]

Inference, parameter estimation

It is easy to set up the models for the examples presented above, but how do you make inference for them? The major tool for estimation and prediction in complex models is Markov chain Monte Carlo (MCMC) methods. We will study how and why they work, and also how to use them to make inference for the examples presented above.

The principal aim of MC simulation

The main problem of this course is to compute some expectation
$$\tau = \mathrm{E}[h(X)] = \int_\chi h(x)\,f(x)\,\mathrm{d}x,$$
where
- $X$ is a random variable taking values in $\chi$,
- $f : \chi \to \mathbb{R}_+$ is the probability density of $X$, and
- $h : \chi \to \mathbb{R}$ is a function such that the expectation above is finite.
This might seem like a very limited problem; however, as we will see, it covers a large set of problems in statistics and scientific modeling.

The curse of dimensionality

Most numerical integration methods are accurate to order $O(N^{-c/d})$, where $N$ is the number of function evaluations used to approximate the integral, $d$ is the dimension, and $c > 0$ is a constant depending on the numerical method; for the trapezoidal rule, $c = 2$. Thus the error of our numerical approximation $\tau_N$ of the integral is
$$\epsilon_N = |\tau - \tau_N| \le C N^{-c/d},$$
where $C > 0$ is a constant depending on the function. To guarantee that the error is less than $\delta$, $N$ must satisfy
$$C N^{-c/d} \le \delta \iff \frac{c}{d}\log N \ge \log\frac{C}{\delta} \iff N \ge e^{\frac{d}{c}\log\frac{C}{\delta}} = \left(\frac{C}{\delta}\right)^{d/c}.$$
This means that for a fixed error the number of function evaluations grows exponentially with the dimension of the problem.
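To see the exponential growth numerically, here is a small illustration in R; the constants $C = 1$ and $\delta = 0.01$ are chosen purely for illustration:

# Function evaluations needed so that C * N^(-c/d) <= delta for the
# trapezoidal rule (c = 2); the constants C and delta are arbitrary.
C <- 1; delta <- 0.01; cc <- 2
d <- c(1, 2, 5, 10, 20)
N_needed <- (C / delta)^(d / cc)         # N >= (C/delta)^(d/c)
data.frame(dimension = d, N_needed = N_needed)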

The Monte Carlo (MC) method in a nutshell

Theorem (law of large numbers*). Let $X_1, X_2, \ldots, X_N$ be independent random variables with density $f$. Then, if $V[h(X)] < \infty$, as $N$ tends to infinity,
$$\tau_N \stackrel{\text{def.}}{=} \frac{1}{N}\sum_{i=1}^N h(X_i) \to \mathrm{E}[h(X)].$$

Inspired by this result, we formulate the following basic MC sampler (Stanisław Ulam, John von Neumann, and Nicholas Metropolis; Los Alamos Scientific Laboratory; 1940s):

for i = 1, ..., N do
    draw X_i ~ f
end for
set tau_N <- sum_{i=1}^{N} h(X_i) / N
return tau_N
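In R, the sampler is a couple of lines; the wrapper below is mine, not course code, and assumes rdist(N) returns N draws from the density f:

# A direct R transcription of the basic MC sampler above (my wrapper).
mc_estimate <- function(h, rdist, N) {
  x <- rdist(N)        # draw X_1, ..., X_N ~ f
  mean(h(x))           # tau_N = (1/N) * sum_i h(X_i)
}
mc_estimate(function(x) x^2, rnorm, 1e5)   # E[X^2] = 1 for X ~ N(0, 1)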

The first thoughts and attempts I made to practice [the Monte Carlo method] were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than abstract thinking might not be to lay it out say one hundred times and simply observe and count the number of successful plays. This was already possible to envisage with the beginning of the new era of fast computers, and I immediately thought of problems of neutron diffusion and other questions of mathematical physics, and more generally how to change processes described by certain differential equations into an equivalent form interpretable as a succession of random operations. Later [in 1946], I described the idea to John von Neumann, and we began to plan actual calculations.
Stanisław Ulam

Example: Integration

The problem of computing an integral of the form $\int_{(0,1)^d} h(x)\,\mathrm{d}x$ can be cast into our framework by letting
$$\chi = (0,1)^d, \qquad f = \mathbf{1}_{(0,1)^d} \quad (= \text{the unif}(0,1)^d \text{ density}).$$

Example: Integration (cont.)

As an example for $d = 1$, let $h(x) = \sin^2(1/\cos(\log(1 + 2\pi x)))$. [Figure: plot of $h(x)$ for $x \in (0,1)$.]

Example: Integration (cont.)

As an example for $d = 1$, let $h(x) = \sin^2(1/\cos(\log(1 + 2\pi x)))$:

N <- 1000
h <- function(x) { sin(1/(cos(log(1 + 2*pi*x))))^2 }
tau_N <- mean(h(runif(N)))    # draw N uniforms and average h over them
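The convergence plot on the next slide can be reproduced with a few extra lines; this is a sketch of mine, reusing h and N from above:

# Running estimate tau_n for n = 1, ..., N, reusing h and N from above.
x <- h(runif(N))
tau_running <- cumsum(x) / seq_len(N)    # tau_n = (1/n) * sum_{i<=n} h(X_i)
plot(tau_running, type = "l", xlab = "iter", ylab = "tau")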

Example: Integration (cont.)

[Figure: running estimate tau (roughly between 0.1 and 0.6) versus iteration, up to 1000 iterations.]

Rate of convergence of MC

So what about the rate of convergence? For the MC method, the error is random. However, the central limit theorem implies, under the assumption that $V(h(X)) < \infty$,
$$\sqrt{N}\,(\tau_N - \tau) \stackrel{d.}{\to} N(0, V(h(X))).$$
This means that for large $N$,
$$V\!\left(\sqrt{N}\,(\tau_N - \tau)\right) = N\,V(\tau_N - \tau) \approx V(h(X)),$$
implying that
$$D(\tau_N - \tau) \stackrel{\text{def.}}{=} \sqrt{V(\tau_N - \tau)} \approx \sqrt{\frac{V(h(X))}{N}} = \frac{D(h(X))}{\sqrt{N}}.$$
Thus, the MC convergence rate $O(N^{-1/2})$ is independent of $d$!
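In practice $V(h(X))$ is unknown, but it can be estimated by the sample variance of the $h(X_i)$; a sketch of mine, reusing h and N from the integration example:

# CLT-based error assessment: estimate D(tau_N - tau) = D(h(X)) / sqrt(N)
# by plugging in the sample standard deviation of the h(X_i).
x  <- h(runif(N))
se <- sd(x) / sqrt(N)
c(estimate = mean(x), CLT_std_error = se)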

Central limit theorem example

The CLT implies that $\sqrt{N / V(h(X))}\,(\tau_N - \tau)$ should be approximately $N(0, 1)$ for large $N$. So let's examine this for our example above, by running 20000 independent repetitions of the Monte Carlo simulation.
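A sketch of the experiment behind the following figures; the repetition code is mine, and since $\tau$ and $V(h(X))$ have no simple closed form for this h, they are estimated from one large auxiliary sample:

# 20000 independent MC estimates for each N, standardized with large-sample
# estimates of tau and V(h(X)).
h <- function(x) { sin(1/(cos(log(1 + 2*pi*x))))^2 }
set.seed(2)
x_big   <- h(runif(1e6))
tau_hat <- mean(x_big); sd_hat <- sd(x_big)
reps <- 20000
for (N in c(1, 2, 3, 4, 5, 10, 20)) {
  z <- replicate(reps, sqrt(N) * (mean(h(runif(N))) - tau_hat) / sd_hat)
  hist(z, breaks = 50, freq = FALSE, main = paste("N =", N))
  curve(dnorm(x), add = TRUE)            # N(0, 1) reference density
}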

Central limit theorem example

[Figures: histograms of the standardized estimate for N = 1, 2, 3, 4, 5, 10 and 20; the histograms approach the N(0, 1) density as N grows.]

What do we need to know?

OK, so what do we need to master to have practical use of the MC method? Well, for instance, the following questions should be answered:
1. How do we generate the needed input random variables?
2. How many computer experiments should we do? What can be said about the error?
3. Can we exploit problem structure to speed up the computation?

Next lecture

Next time we will deal with the first two issues and discuss pseudo-random number generation and MC output analysis. See you!