Practical Numerical Methods in Physics and Astronomy. Lecture 5 Optimisation and Search Techniques


Pat Scott, Department of Physics, McGill University
January 30, 2013
Slides available from patscott

Outline 1 General Considerations 2

General Considerations

Optimisation: the problem
- Optimisation is finding global minima and maxima: for what $\vec{x} = \vec{x}_{\rm needle}$ does $\min[f_{\rm haystack}(\vec{x})] = f_{\rm haystack}(\vec{x}_{\rm needle})$?
- Maximisation is just minimisation of $-f_{\rm haystack}(\vec{x})$
- Usually everything is posed as minimisation

The general strategy
- To optimise, we always require an objective or fitness function $f_{\rm haystack}$
- Any search problem can be posed in terms of some sort of fitness function
- We may care just about finding $\vec{x}_{\rm needle}$, or about mapping $f_{\rm haystack}$ in the region of $\vec{x}_{\rm needle}$, e.g. when comparing theory to data:
  - just the best-fit parameters?
  - or errors on the best fit?
  - or just a good overall map, without even finding the exact best fit?
- The goal is the global minimum, but the result is often a local minimum

Optimisation vs root finding
- Multi-d optimisation is usually easier than multi-d root finding
- Optimisation by root finding on $\nabla f_{\rm haystack} = 0$ doesn't work
  - makes all local minima and maxima (and points of inflection!) degenerate
  - highly unlikely to find the global extremum
- Root finding for $h(\vec{x}) = 0$ by minimisation of $h^2(\vec{x})$ is not enough
  - generally runs into problems with local minima
  - can be improved by combination with Newton's method in multi-d

Options
- deterministic, non-gradient methods
  - Brent's method in 1D
  - downhill simplex in multi-d
- deterministic, gradient-based methods
  - steepest descent
- stochastic, gradient-inspired methods
  - MCMCs
  - nested sampling
  - simulated annealing
- stochastic, non-gradient methods
  - genetic algorithms
  - differential evolution
- many others...


Brent's method
Synopsis: Bracket the minimum with 3 points and use Brent's usual tricks
[Figure: parabolas fitted through successive sets of points]
- Tracks 6 individual points
- Always 2 brackets and a third point lower than the brackets
- Quadratic interpolation + bisection
- Similar point ID shuffling to the root-finding version
- Similar conditions for the interpolation step
- Can be improved with derivative information
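The bracketing-plus-interpolation idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Brent's full method: it keeps only a bracketing triple (rather than 6 points), tries a parabolic step, and falls back to a golden-section step (an assumption, used here in place of the bisection named on the slide) when the parabola is unusable or the bracket is shrinking slowly. The names `parabola_vertex` and `minimise_1d`, the 0.7 shrink factor and the tolerances are hypothetical choices.

```python
import math

GOLD = (3.0 - math.sqrt(5.0)) / 2.0  # golden-section fraction, about 0.382


def parabola_vertex(a, fa, b, fb, c, fc):
    """Abscissa of the vertex of the parabola through three points (None if degenerate)."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0.0:
        return None
    return b - 0.5 * num / den


def minimise_1d(f, a, b, c, tol=1e-7, max_iter=500):
    """Shrink a bracket a < b < c, with f(b) < f(a) and f(b) < f(c), around a minimum."""
    fa, fb, fc = f(a), f(b), f(c)
    force_golden = False
    for _ in range(max_iter):
        width = c - a
        xtol = tol * (abs(b) + 1.0)
        if width < xtol:
            break
        x = None if force_golden else parabola_vertex(a, fa, b, fb, c, fc)
        # Reject the parabolic step if it leaves the bracket or barely moves.
        if x is None or not (a < x < c) or abs(x - b) < 0.1 * xtol:
            # Golden-section step into the larger of the two sub-intervals.
            x = b + GOLD * (c - b) if (c - b) > (b - a) else b - GOLD * (b - a)
        fx = f(x)
        if fx < fb:
            # x is the new best point; the old best becomes a bracket end.
            if x > b:
                a, fa = b, fb
            else:
                c, fc = b, fb
            b, fb = x, fx
        else:
            # x is no better than b, so it replaces the bracket end on its side.
            if x > b:
                c, fc = x, fx
            else:
                a, fa = x, fx
        # If the bracket shrank poorly, insist on a golden-section step next time.
        force_golden = (c - a) > 0.7 * width
    return b, fb


# Example: the minimum of cos(x) bracketed by (2, 3, 5) lies at x = pi.
x_min, f_min = minimise_1d(math.cos, 2.0, 3.0, 5.0)
```

The bracket ends always stay higher than the middle point, so every step, parabolic or golden-section, keeps the minimum enclosed.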


Steepest descent
Synopsis: Follow the gradient until you hit a local (line) minimum; reassess.
- Always need to hang a 90° left or right
- Works (for local minima), but inefficient
- Requires a 1D minimisation routine (e.g. Brent's)
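A minimal sketch of steepest descent with line minimisation follows. It assumes an analytic gradient is available and uses a plain golden-section search as the 1D routine in place of Brent's method; the function names, the step-doubling bracket and the test function are hypothetical choices made for illustration.

```python
import math


def golden_section(phi, lo, hi, tol=1e-10):
    """Minimise a unimodal 1D function phi on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    width_tol = tol * (1.0 + abs(hi))
    x1 = hi - invphi * (hi - lo)
    x2 = lo + invphi * (hi - lo)
    f1, f2 = phi(x1), phi(x2)
    while hi - lo > width_tol:
        if f1 < f2:
            # Minimum lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - invphi * (hi - lo)
            f1 = phi(x1)
        else:
            # Minimum lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + invphi * (hi - lo)
            f2 = phi(x2)
    return 0.5 * (lo + hi)


def steepest_descent(f, grad, x0, gtol=1e-6, max_iter=1000):
    """Repeat: follow -grad f to the line minimum, then reassess the gradient."""
    x = list(x0)
    n_steps = 0
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < gtol:
            break

        def phi(t):
            # Objective along the downhill line x - t * g.
            return f([xi - t * gi for xi, gi in zip(x, g)])

        # Crude bracket: double the step until the function stops decreasing.
        t = 1e-3
        while phi(2.0 * t) < phi(t) and t < 1e6:
            t *= 2.0
        t_best = golden_section(phi, 0.0, 2.0 * t)
        x = [xi - t_best * gi for xi, gi in zip(x, g)]
        n_steps += 1
    return x, n_steps


def bowl(p):
    # Elongated quadratic bowl with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2


def bowl_grad(p):
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]


x_best, n_steps = steepest_descent(bowl, bowl_grad, [0.0, 0.0])
```

At each line minimum the new gradient is perpendicular to the old search direction, which is the 90° turn noted above. On an elongated bowl like this one the path zigzags, which is why the method takes many steps.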

Variants on steepest descent
- The general idea of line minimisation can be improved
- Improvement comes from a better directional basis set: direction set methods
  - many ways to choose the basis
  - goal is to choose directions such that successive line minimisations don't interfere
- Still uses 1D minimisation along a line, so requires Brent's or similar


The downhill simplex method
Synopsis: Ooze down the slope and around corners like a blob of goo (or an amoeba)
- short, simple, fun, effective
- any dimension
- no brackets, derivatives or line minimisation required
- still only good for local minima

1 Evaluate $f(\vec{x})$ at corners
2 Find worst-fit corner
3 Replace worst-fit corner with new point reflected across remaining points
4 If new point is awesome*, extend simplex as well
5 If new point is terrible, discard it and try 1D contraction instead
6 If 1D contraction is also terrible, do multi-d contraction about best corner
*awesome = better than best fit; terrible = worse than second-worst fit; OK = in between
[Figure: simplex at beginning of step, with high and low corners marked, and the possible moves before the next step: reflection ("Let's try this!"), reflection and expansion ("That was awesome!"), reflection kept ("That was OK."), contraction ("That was crap."), multiple contraction ("That was crap too.")]
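The six steps above can be sketched as follows. This is an illustrative variant under stated assumptions, not a reference implementation: the reflection, expansion and contraction factors (-1, -2, 0.5), the rule that the 1D contraction is kept whenever it beats the worst corner, and the absolute stopping tolerance `ftol` are common choices rather than anything fixed by the slides, and the function name `downhill_simplex` is hypothetical.

```python
def downhill_simplex(f, simplex, ftol=1e-12, max_iter=5000):
    """Minimise f starting from a list of n+1 corner points in n dimensions."""
    pts = [list(p) for p in simplex]
    vals = [f(p) for p in pts]                      # step 1
    n = len(pts) - 1
    for _ in range(max_iter):
        # Step 2: sort so pts[0] is the best corner and pts[n] the worst.
        order = sorted(range(n + 1), key=lambda i: vals[i])
        pts = [pts[i] for i in order]
        vals = [vals[i] for i in order]
        if vals[n] - vals[0] < ftol:
            break
        worst = pts[n]
        centroid = [sum(p[j] for p in pts[:n]) / n for j in range(n)]

        def along(scale):
            # Point on the line through the centroid and the worst corner.
            return [c + scale * (w - c) for c, w in zip(centroid, worst)]

        trial = along(-1.0)                          # step 3: reflection
        f_trial = f(trial)
        if f_trial < vals[0]:                        # step 4: "awesome"
            ext = along(-2.0)                        # reflection and expansion
            f_ext = f(ext)
            if f_ext < f_trial:
                trial, f_trial = ext, f_ext
            pts[n], vals[n] = trial, f_trial
        elif f_trial < vals[n - 1]:                  # "OK": keep the reflection
            pts[n], vals[n] = trial, f_trial
        else:                                        # step 5: "terrible"
            con = along(0.5)                         # 1D contraction
            f_con = f(con)
            if f_con < vals[n]:
                pts[n], vals[n] = con, f_con
            else:                                    # step 6: multi-d contraction
                best = pts[0]
                for i in range(1, n + 1):
                    pts[i] = [0.5 * (b + x) for b, x in zip(best, pts[i])]
                    vals[i] = f(pts[i])
    i_best = min(range(n + 1), key=lambda i: vals[i])
    return pts[i_best], vals[i_best]


def bowl(p):
    # Quadratic test function with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2


x_best, f_best = downhill_simplex(bowl, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Only function values are compared, so no brackets, derivatives or line minimisations are needed.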


MCMCs
Synopsis: Jump around like a particle diffusing down a gradient
- Biased random walk
- Trotta's example: like an elephant on the savannah looking for water
  - wanders randomly until it finds a few puddles
  - moves generally and stochastically around the surrounding area until it sights a bigger puddle
  - in doing so, moves on average in the direction $\nabla n_{\rm puddles}$
  - ... until it finds a stream...
  - follows stream to jackpot

Definition 1. Monte Carlo: direct simulation of some stochastic process by drawing repeated samples from a known distribution
Definition 2. Markov Chain: a string of system states / samples where each state depends only on the previous one
Definition 3. Markov Chain Monte Carlo (MCMC): a Monte Carlo sampling from a distribution where each new sample is drawn with some reference to the last

Metropolis-Hastings sampling
- One particular sampling scheme for generating Markov Chains
- Best known, has nice statistical properties (more later)
- Randomly generate a new proposed point $\vec{x}_{\rm maybe}$
- Test if $f_{\rm haystack}(\vec{x}_{\rm maybe}) < f_{\rm haystack}(\vec{x}_{\rm current})$
  - If so, $\vec{x}_{\rm maybe} \to \vec{x}_{\rm new}$
  - If not, choose $\vec{x}_{\rm maybe} \to \vec{x}_{\rm new}$ with probability $f_{\rm haystack}(\vec{x}_{\rm current})/f_{\rm haystack}(\vec{x}_{\rm maybe})$
  - ... and $\vec{x}_{\rm current} \to \vec{x}_{\rm new}$ with probability $1 - f_{\rm haystack}(\vec{x}_{\rm current})/f_{\rm haystack}(\vec{x}_{\rm maybe})$

Proposal functions
Q: How do you generate the proposed point?
A: You need a proposal function $P(\vec{x})$
- Generally some local distribution centred on the current point
- e.g. a product of Gaussians in every direction
- or a multi-d Gaussian pdf (not the same thing!!)
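One way to read the distinction between the last two options is that a product of 1D Gaussians perturbs each coordinate independently, while a general multi-d Gaussian can carry correlations between coordinates. The sketch below illustrates that reading using only the standard library; the function names, the covariance matrix and the sample size are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L L^T = cov, for a symmetric positive-definite matrix."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def propose_independent(x, sigmas, rng):
    """Product of 1D Gaussians: each coordinate is perturbed on its own."""
    return [xi + s * rng.gauss(0.0, 1.0) for xi, s in zip(x, sigmas)]


def propose_correlated(x, chol, rng):
    """Multi-d Gaussian: x + L z, with z a vector of unit normal deviates."""
    z = [rng.gauss(0.0, 1.0) for _ in x]
    return [xi + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, xi in enumerate(x)]


def correlation(points):
    """Sample correlation coefficient between the first two coordinates."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)
chol = cholesky([[1.0, 0.9], [0.9, 1.0]])  # unit variances, correlation 0.9
indep = [propose_independent([0.0, 0.0], [1.0, 1.0], rng) for _ in range(20000)]
corr = [propose_correlated([0.0, 0.0], chol, rng) for _ in range(20000)]
r_indep = correlation(indep)
r_corr = correlation(corr)
```

Both proposals have the same variance in each coordinate, but only the second can align its steps with a tilted, elongated mode of the objective.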

Proposal functions & burn-in
- Ideally, $P = f_{\rm haystack}$ in the vicinity of $\vec{x}_{\rm current}$; not usually practical
- $P$ should be chosen adaptively to get the best approximation to $f_{\rm haystack}$
  - e.g. by analysing previous points and adjusting $\sigma$ for a Gaussian $P$
- After a suitable number of steps, memory of the starting point is gone; this is the burn-in period, and all points during burn-in should be discarded
- Proposal function may be fixed after burn-in (more later)

MCMC step by step (for minimisation)
1 Initialise $P$
2 Choose a random starting point $\vec{z}$
3 Take a Metropolis-Hastings step
  a. Choose a proposal point $\vec{y}$ from $P$
  b. If $\alpha \equiv f(\vec{z})/f(\vec{y}) \geq 1$, accept $\vec{y}$ as the new $\vec{z}$
  c. Otherwise ($\alpha < 1$), generate a random uniform deviate $\beta$
  d. If $\beta < \alpha$, accept $\vec{y}$ as the new $\vec{z}$
  e. Otherwise, $\vec{z}$ remains the same
4 If burn-in is still going, adjust $P$ (usually on the basis of previous points)
5 If burn-in is finished, test for convergence
6 Repeat from Step 3
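The loop above can be sketched as follows for a strictly positive objective. Several details are assumptions made for illustration: the proposal is a product of Gaussians with a single width `sigma`, the burn-in adjustment nudges `sigma` according to the recent acceptance rate (one simple option among many), the starting point is passed in rather than drawn at random, and the chain is run for a fixed length instead of testing for convergence in step 5. The function name `mcmc_minimise` and the test objective are hypothetical.

```python
import math
import random


def mcmc_minimise(f, start, n_burn=2000, n_keep=20000, sigma=1.0, seed=1):
    """Metropolis-Hastings chain for minimising a strictly positive f."""
    rng = random.Random(seed)
    z = list(start)                                   # step 2 (start supplied)
    fz = f(z)
    best, f_best = list(z), fz
    chain = []
    accepted = 0
    for step in range(1, n_burn + n_keep + 1):
        # Step 3a: propose y from a Gaussian centred on the current point.
        y = [zi + sigma * rng.gauss(0.0, 1.0) for zi in z]
        fy = f(y)
        alpha = fz / fy                               # step 3b
        # Steps 3b-3e: always accept downhill moves, uphill ones with prob. alpha.
        if alpha >= 1.0 or rng.random() < alpha:
            z, fz = y, fy
            accepted += 1
            if fz < f_best:
                best, f_best = list(z), fz
        if step <= n_burn:
            # Step 4: adapt the proposal width every 100 burn-in steps.
            if step % 100 == 0:
                rate = accepted / 100.0
                if rate > 0.5:
                    sigma *= 1.2
                elif rate < 0.2:
                    sigma /= 1.2
                accepted = 0
        else:
            # Burn-in points are discarded; only later points are stored.
            chain.append(list(z))
    return best, f_best, chain, sigma


def f_test(p):
    # Positive test objective with its minimum at (3, -1).
    return math.exp(0.5 * ((p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2))


best, f_best, chain, sigma_final = mcmc_minimise(f_test, [-2.0, 4.0])
mean_x = sum(p[0] for p in chain) / len(chain)
mean_y = sum(p[1] for p in chain) / len(chain)
```

With this acceptance rule the stored points cluster where the objective is small, so both the best point found and the chain mean land near the minimum for this unimodal test function.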

Statistical features of MCMCs
- Bayesians love MCMCs...
  - MCMC procedure ensures that the density of points in the chain is proportional to the value of $f_{\rm haystack}$
  - Makes marginalising (integrating) over uninteresting parameters easy: just sum the number of points
  - Must fix the proposal function to have this property $\Rightarrow$ extra-important to throw out burn-in points
- MCMCs and similar algorithms can also be good for frequentists
  - Don't fix the proposal function; let it keep optimising itself on the go to find the global minimum
  - Use a very strict convergence criterion

Convergence
- Local minima are an issue
  - Easy to get stuck if a local mode is wider/deeper than the proposal function
  - Need to use multiple chains with different starting values, and combine results
- Convergence criteria
  - Coarsest option is to test the variance $\sigma^2_{\rm running}$ of the last few points in the chain
    - $\sigma^2_{\rm running} < \sigma^2_{\rm threshold}$ $\Rightarrow$ chain found a minimum (local/global)
    - Very rough but OK(ish) if you know $f_{\rm haystack}$ is unimodal
  - Can be defined in terms of fractional change in Bayesian evidence ($\int f(\vec{x})\,{\rm d}\vec{x}$)
  - Many other more sophisticated schemes
  - Some people use a constant length of chain; this can be risky
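The coarsest criterion can be written in a few lines. The sketch assumes the test is applied to a scalar series taken from the chain (for example the objective value at each step); the function name, the window length and the threshold are hypothetical.

```python
import statistics


def running_variance_converged(values, window=50, threshold=1e-6):
    """True if the variance of the last `window` values is below `threshold`."""
    if len(values) < window:
        return False  # not enough points yet to judge
    return statistics.pvariance(values[-window:]) < threshold


# A series that is still drifting, and one that has settled down.
drifting = [float(k) for k in range(100)]
settled = [10.0] * 10 + [2.0] * 60
```

Here `running_variance_converged(drifting)` is false and `running_variance_converged(settled)` is true. As noted above, passing this test only says that the chain has settled somewhere, not that the minimum is global.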

A couple of other random points...
Temperature
- Chains can be assigned different temperatures $T$, such that
  $$\alpha \equiv \frac{f(\vec{z})}{f(\vec{y})} \;\to\; \exp\left[\frac{\ln f(\vec{z}) - \ln f(\vec{y})}{T}\right] = \frac{\exp[\ln f(\vec{z})/T]}{\exp[\ln f(\vec{y})/T]} = \left(\frac{f(\vec{z})}{f(\vec{y})}\right)^{1/T}$$
- $T > 1$ $\Rightarrow$ for $\alpha < 1$, $\alpha$ goes up vs. normal MCMC $\Rightarrow$ steps are more easily accepted
- Like giving the jumpy diffusive particle a higher $T$; allows skipping over local minima more easily
- Combining chains with different $T$ breaks statistical properties (I think)
Alternative sampling methods
- Gibbs sampling, slice sampling, others...
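A small numerical check of the tempered acceptance ratio, as a sketch assuming strictly positive objective values; the function name `tempered_alpha` is hypothetical.

```python
import math


def tempered_alpha(f_z, f_y, temperature=1.0):
    """Acceptance ratio (f_z / f_y)**(1/T), computed through logarithms."""
    return math.exp((math.log(f_z) - math.log(f_y)) / temperature)


alpha_normal = tempered_alpha(1.0, 4.0, temperature=1.0)  # ordinary MCMC ratio
alpha_hot = tempered_alpha(1.0, 4.0, temperature=2.0)     # same step at T = 2
```

For this uphill step the ratio rises from 0.25 at $T = 1$ to 0.5 at $T = 2$, so the hotter chain accepts it twice as often.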

A few MCMC examples in research
[Figures: plot with axes $m_0$ (TeV) and $m_{1/2}$ (TeV); plot from Putze et al. (2010)]

When to use which method?
- ... as always, this is problem-specific... make sure to try a few
- For 1D where you can bracket, Brent's is best
- For multi-d with few modes, direction-set-type methods do OK
- For multi-d with many modes, and/or badly-behaved $f$, need MCMC/MultiNest/GAs

Housekeeping
- Next lecture: Monday Feb 4, Numerical Integration I


More information

Local Search & Optimization

Local Search & Optimization Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Some

More information

Bayesian Inference and MCMC

Bayesian Inference and MCMC Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the

More information

Signal Modeling, Statistical Inference and Data Mining in Astrophysics

Signal Modeling, Statistical Inference and Data Mining in Astrophysics ASTRONOMY 6523 Spring 2013 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Course Approach The philosophy of the course reflects that of the instructor, who takes a dualistic view

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods Prof. Daniel Cremers 11. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information

CPSC 340: Machine Learning and Data Mining. Regularization Fall 2017

CPSC 340: Machine Learning and Data Mining. Regularization Fall 2017 CPSC 340: Machine Learning and Data Mining Regularization Fall 2017 Assignment 2 Admin 2 late days to hand in tonight, answers posted tomorrow morning. Extra office hours Thursday at 4pm (ICICS 246). Midterm

More information

CSC 446 Notes: Lecture 13

CSC 446 Notes: Lecture 13 CSC 446 Notes: Lecture 3 The Problem We have already studied how to calculate the probability of a variable or variables using the message passing method. However, there are some times when the structure

More information

Monte Carlo methods for sampling-based Stochastic Optimization

Monte Carlo methods for sampling-based Stochastic Optimization Monte Carlo methods for sampling-based Stochastic Optimization Gersende FORT LTCI CNRS & Telecom ParisTech Paris, France Joint works with B. Jourdain, T. Lelièvre, G. Stoltz from ENPC and E. Kuhn from

More information

Local Search & Optimization

Local Search & Optimization Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Outline

More information

Lecture 6: Monte-Carlo methods

Lecture 6: Monte-Carlo methods Miranda Holmes-Cerfon Applied Stochastic Analysis, Spring 2015 Lecture 6: Monte-Carlo methods Readings Recommended: handout on Classes site, from notes by Weinan E, Tiejun Li, and Eric Vanden-Eijnden Optional:

More information

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture) 10-708: Probabilistic Graphical Models 10-708, Spring 2014 18 : Advanced topics in MCMC Lecturer: Eric P. Xing Scribes: Jessica Chemali, Seungwhan Moon 1 Gibbs Sampling (Continued from the last lecture)

More information

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Monte Carlo Inference Methods

Monte Carlo Inference Methods Monte Carlo Inference Methods Iain Murray University of Edinburgh http://iainmurray.net Monte Carlo and Insomnia Enrico Fermi (1901 1954) took great delight in astonishing his colleagues with his remarkably

More information

Statistics and Prediction. Tom

Statistics and Prediction. Tom Statistics and Prediction Tom Kitching tdk@roe.ac.uk @tom_kitching David Tom David Tom photon What do we want to measure How do we measure Statistics of measurement Cosmological Parameter Extraction Understanding

More information

Lecture 8 HASHING!!!!!

Lecture 8 HASHING!!!!! Lecture 8 HASHING!!!!! Announcements HW3 due Friday! HW4 posted Friday! Q: Where can I see examples of proofs? Lecture Notes CLRS HW Solutions Office hours: lines are long L Solutions: We will be (more)

More information

LECTURE 15 Markov chain Monte Carlo

LECTURE 15 Markov chain Monte Carlo LECTURE 15 Markov chain Monte Carlo There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte

More information

Advanced Statistical Modelling

Advanced Statistical Modelling Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1

More information

4452 Mathematical Modeling Lecture 16: Markov Processes

4452 Mathematical Modeling Lecture 16: Markov Processes Math Modeling Lecture 16: Markov Processes Page 1 4452 Mathematical Modeling Lecture 16: Markov Processes Introduction A stochastic model is one in which random effects are incorporated into the model.

More information

Lect4: Exact Sampling Techniques and MCMC Convergence Analysis

Lect4: Exact Sampling Techniques and MCMC Convergence Analysis Lect4: Exact Sampling Techniques and MCMC Convergence Analysis. Exact sampling. Convergence analysis of MCMC. First-hit time analysis for MCMC--ways to analyze the proposals. Outline of the Module Definitions

More information

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics Bayesian phylogenetics the one true tree? the methods we ve learned so far try to get a single tree that best describes the data however, they admit that they don t search everywhere, and that it is difficult

More information

The Not-Formula Book for C1

The Not-Formula Book for C1 Not The Not-Formula Book for C1 Everything you need to know for Core 1 that won t be in the formula book Examination Board: AQA Brief This document is intended as an aid for revision. Although it includes

More information

Report due date. Please note: report has to be handed in by Monday, May 16 noon.

Report due date. Please note: report has to be handed in by Monday, May 16 noon. Lecture 23 18.86 Report due date Please note: report has to be handed in by Monday, May 16 noon. Course evaluation: Please submit your feedback (you should have gotten an email) From the weak form to finite

More information

1 Probabilities. 1.1 Basics 1 PROBABILITIES

1 Probabilities. 1.1 Basics 1 PROBABILITIES 1 PROBABILITIES 1 Probabilities Probability is a tricky word usually meaning the likelyhood of something occuring or how frequent something is. Obviously, if something happens frequently, then its probability

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

CONTENTS. Preface List of Symbols and Notation

CONTENTS. Preface List of Symbols and Notation CONTENTS Preface List of Symbols and Notation xi xv 1 Introduction and Review 1 1.1 Deterministic and Stochastic Models 1 1.2 What is a Stochastic Process? 5 1.3 Monte Carlo Simulation 10 1.4 Conditional

More information

References. Markov-Chain Monte Carlo. Recall: Sampling Motivation. Problem. Recall: Sampling Methods. CSE586 Computer Vision II

References. Markov-Chain Monte Carlo. Recall: Sampling Motivation. Problem. Recall: Sampling Methods. CSE586 Computer Vision II References Markov-Chain Monte Carlo CSE586 Computer Vision II Spring 2010, Penn State Univ. Recall: Sampling Motivation If we can generate random samples x i from a given distribution P(x), then we can

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Chapter Usual types of questions Tips What can go ugly. and, common denominator will be

Chapter Usual types of questions Tips What can go ugly. and, common denominator will be C3 Cheat Sheet Chapter Usual types of questions Tips What can go ugly 1 Algebraic Almost always adding or subtracting Factorise everything in each fraction first. e.g. If denominators Blindly multiplying

More information

MCMC Methods: Gibbs and Metropolis

MCMC Methods: Gibbs and Metropolis MCMC Methods: Gibbs and Metropolis Patrick Breheny February 28 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/30 Introduction As we have seen, the ability to sample from the posterior distribution

More information

Markov Chain Monte Carlo, Numerical Integration

Markov Chain Monte Carlo, Numerical Integration Markov Chain Monte Carlo, Numerical Integration (See Statistics) Trevor Gallen Fall 2015 1 / 1 Agenda Numerical Integration: MCMC methods Estimating Markov Chains Estimating latent variables 2 / 1 Numerical

More information

Paul Karapanagiotidis ECO4060

Paul Karapanagiotidis ECO4060 Paul Karapanagiotidis ECO4060 The way forward 1) Motivate why Markov-Chain Monte Carlo (MCMC) is useful for econometric modeling 2) Introduce Markov-Chain Monte Carlo (MCMC) - Metropolis-Hastings (MH)

More information

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Engineering Part IIB: Module 4F0 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 202 Engineering Part IIB:

More information

The First Derivative Test

The First Derivative Test The First Derivative Test We have already looked at this test in the last section even though we did not put a name to the process we were using. We use a y number line to test the sign of the first derivative

More information

Monte Carlo Markov Chains: A Brief Introduction and Implementation. Jennifer Helsby Astro 321

Monte Carlo Markov Chains: A Brief Introduction and Implementation. Jennifer Helsby Astro 321 Monte Carlo Markov Chains: A Brief Introduction and Implementation Jennifer Helsby Astro 321 What are MCMC: Markov Chain Monte Carlo Methods? Set of algorithms that generate posterior distributions by

More information

Markov chain Monte Carlo methods in atmospheric remote sensing

Markov chain Monte Carlo methods in atmospheric remote sensing 1 / 45 Markov chain Monte Carlo methods in atmospheric remote sensing Johanna Tamminen johanna.tamminen@fmi.fi ESA Summer School on Earth System Monitoring and Modeling July 3 Aug 11, 212, Frascati July,

More information

LOCAL SEARCH. Today. Reading AIMA Chapter , Goals Local search algorithms. Introduce adversarial search 1/31/14

LOCAL SEARCH. Today. Reading AIMA Chapter , Goals Local search algorithms. Introduce adversarial search 1/31/14 LOCAL SEARCH Today Reading AIMA Chapter 4.1-4.2, 5.1-5.2 Goals Local search algorithms n hill-climbing search n simulated annealing n local beam search n genetic algorithms n gradient descent and Newton-Rhapson

More information

9 Markov chain Monte Carlo integration. MCMC

9 Markov chain Monte Carlo integration. MCMC 9 Markov chain Monte Carlo integration. MCMC Markov chain Monte Carlo integration, or MCMC, is a term used to cover a broad range of methods for numerically computing probabilities, or for optimization.

More information

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 3 SIR models - more topics Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

More information

1 Probabilities. 1.1 Basics 1 PROBABILITIES

1 Probabilities. 1.1 Basics 1 PROBABILITIES 1 PROBABILITIES 1 Probabilities Probability is a tricky word usually meaning the likelyhood of something occuring or how frequent something is. Obviously, if something happens frequently, then its probability

More information