Cross Entropy. Owen Jones, Cardiff University.
Outline: Cross Entropy; CE for rare events; CE for discrete optimisation; TSP example; A new setting; Necessary and sufficient conditions for optimality.
2 The Cross-Entropy (CE) method started as an adaptive importance sampling scheme for estimating rare-event probabilities (Rubinstein, 1997), before being successfully applied to a variety of combinatorial optimisation problems (Rubinstein, 1999, 2001). It is a model-based stochastic search technique and requires a parameterised sampling distribution.
3 Let X be a r.v. on some space 𝒳, and let f(·; v) be a family of pdfs on 𝒳, indexed by v. We suppose that we are interested in estimating, for some given S, γ and u,

l := P_u(S(X) ≥ γ).

Using an importance sampling density g we have the estimate

\hat{l} = \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma\}} \frac{f(X_i; u)}{g(X_i)},

where X_1, ..., X_N is an i.i.d. sample from g.
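A minimal numerical sketch of this estimator (our own illustrative setup, not from the slides): take f(·; u) to be the standard normal density, S(x) = x, γ = 4, and g a normal density shifted to γ.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, N = 4.0, 100_000

# Importance density g: normal centred at gamma, so {S(X) >= gamma} is no longer rare.
xs = rng.normal(loc=gamma, scale=1.0, size=N)            # i.i.d. sample from g
f = np.exp(-xs**2 / 2) / np.sqrt(2 * np.pi)              # f(x_i; u), standard normal
g = np.exp(-(xs - gamma)**2 / 2) / np.sqrt(2 * np.pi)    # g(x_i)
l_hat = np.mean((xs >= gamma) * f / g)                   # (1/N) sum I{S(X_i)>=gamma} f/g

print(l_hat)  # close to P(X >= 4) ~ 3.17e-5
```

A crude Monte Carlo estimate with N = 100,000 draws from f itself would typically see only a handful of exceedances of 4; the shifted sampler makes roughly half the draws count.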
4 The ideal but unachievable choice for g is

g^*(x) := I_{\{S(x) \ge \gamma\}} \frac{f(x; u)}{l}.

The idea behind the CE method is to choose g from the family f(·; v) so that it is as close as possible to g^*, according to the Kullback-Leibler distance. That is (after a little work), choose v to maximise

D(v) = E_u \left[ I_{\{S(X) \ge \gamma\}} \log f(X; v) \right].

In information theory the Kullback-Leibler distance is known as the cross-entropy.
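The "little work" is the standard reduction of the Kullback-Leibler distance: writing

```latex
\mathcal{D}\left(g^{*} \,\middle\|\, f(\cdot;v)\right)
  = \int g^{*}(x) \log g^{*}(x)\,dx \;-\; \int g^{*}(x) \log f(x;v)\,dx,
```

only the second integral depends on v, and substituting g^*(x) = I_{\{S(x) \ge \gamma\}} f(x; u)/l gives

```latex
\int g^{*}(x) \log f(x;v)\,dx
  = \frac{1}{l}\, E_u\left[ I_{\{S(X) \ge \gamma\}} \log f(X;v) \right],
```

so minimising the KL distance over v is exactly maximising D(v): the constant 1/l does not affect the argmax.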
5 We can estimate D(v) using importance sampling! For any w we can form

\hat{D}(v) = \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma\}} \frac{f(X_i; u)}{f(X_i; w)} \log f(X_i; v),

where X_1, ..., X_N is an i.i.d. sample from f(·; w). In many cases of interest we can maximise \hat{D} analytically, to get an estimate of the optimal v. We would like to choose w to reduce the variance of \hat{D}, which is best achieved if P_w(S(X) ≥ γ) is large. We achieve this using an iterative approach...
6 The algorithm

We fix a parameter ρ ∈ (0, 1).
1. Put v_0 = u and t = 1.
2. Sample X_1, ..., X_N from f(·; v_{t-1}). Let γ_t be the 1-ρ quantile of the ordered sample S(X_1), ..., S(X_N). If γ_t > γ put γ_t = γ.
3. Maximise the modified estimate of D using γ_t (instead of γ):

v_t = \arg\max_v \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma_t\}} \frac{f(X_i; u)}{f(X_i; v_{t-1})} \log f(X_i; v).

4. If γ_t < γ put t ← t + 1 and go back to 2.
Use the final v_t to generate an importance sample with which to estimate P_u(S(X) ≥ γ).
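As a concrete sketch (our own example, with an assumed setup, not from the slides): take f(·; v) to be the Exponential density with mean v, S(x) = x, and u = 1. For this exponential family the argmax in step 3 has a closed form, the weighted sample mean, so the whole algorithm fits in a few lines.

```python
import numpy as np

rng = np.random.default_rng(1)
u, gamma, N, rho = 1.0, 10.0, 10_000, 0.1

def pdf(x, v):
    """Exponential density with mean v: f(x; v) = (1/v) exp(-x/v)."""
    return np.exp(-x / v) / v

v = u                                                 # step 1: v_0 = u
for _ in range(100):
    x = rng.exponential(scale=v, size=N)              # step 2: sample from f(.; v_{t-1})
    gamma_t = min(np.quantile(x, 1 - rho), gamma)     # 1-rho quantile, capped at gamma
    w = (x >= gamma_t) * pdf(x, u) / pdf(x, v)        # indicator times likelihood ratio
    v = np.sum(w * x) / np.sum(w)                     # step 3: analytic argmax (weighted mean)
    if gamma_t >= gamma:                              # step 4: stop once gamma is reached
        break

# Final importance sample to estimate P_u(X >= gamma) = exp(-10)
x = rng.exponential(scale=v, size=N)
l_hat = np.mean((x >= gamma) * pdf(x, u) / pdf(x, v))
print(l_hat)
```

Typically only three or four iterations are needed before γ_t hits γ, with v ending up near γ + u.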
7 CE for discrete optimisation

Suppose now that we wish to find the maximum of some function S over a (discrete) set 𝒳. Let γ* be the maximum, obtained at x* say. To apply the CE method we need to convert this optimisation problem into an estimation problem. Let f(·; v) be a family of pmfs on 𝒳; then, for (almost) any u, finding γ* is equivalent to estimating

P_u(S(X) \ge \gamma^*) = \sum_x I_{\{S(x) \ge \gamma^*\}} f(x; u).

Applying the CE method to this problem we get...
8 CE for discrete optimisation: Not quite the algorithm

We fix a parameter ρ ∈ (0, 1).
1. Put v_0 = u and t = 1.
2. Sample X_1, ..., X_N from f(·; v_{t-1}). Let γ_t be the 1-ρ quantile of the ordered sample S(X_1), ..., S(X_N).
3. Maximise the modified estimate of D using γ_t:

v_t = \arg\max_v \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma_t\}} \frac{f(X_i; u)}{f(X_i; v_{t-1})} \log f(X_i; v).

4. If γ_t hasn't changed for a while then stop, o/w put t ← t + 1 and go back to 2.
9 CE for discrete optimisation

We can improve/simplify the algorithm in two ways. Firstly, given that u is arbitrary, we replace u by v_{t-1} in step 3, so that the term f(X_i; u)/f(X_i; v_{t-1}) drops out. Secondly, we use smoothed updating, to avoid assigning zero probability to points in the sample space.
10 CE for discrete optimisation: The algorithm

We fix parameters ρ, α ∈ (0, 1).
1. Put v_0 = u and t = 1.
2. Sample X_1, ..., X_N from f(·; v_{t-1}). Let γ_t be the 1-ρ quantile of the ordered sample S(X_1), ..., S(X_N).
3. Maximise the modified estimate of D using γ_t:

w_t = \arg\max_w \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma_t\}} \log f(X_i; w).

4. Put v_t = α w_t + (1 - α) v_{t-1}.
5. If γ_t hasn't changed for a while then stop, o/w put t ← t + 1 and go back to 2.
11 Choosing parameters

There are a number of parameters to choose:
- The family of distributions f(·; v) often serves to provide a continuous interpolation of the discrete space 𝒳.
- The sample size N should be of the same order as the dimension of v.
- ρ = 0.01 works well ;)
- Small α will slow convergence, but increase the chance of finding the optimum (see later).
12 Consider a complete graph on n nodes with edge weights d_{ij} (distances). The Travelling Salesman Problem (TSP) is to find the permutation σ which minimises

S(\sigma) := d_{\sigma(n),\sigma(1)} + \sum_{i=1}^{n-1} d_{\sigma(i),\sigma(i+1)}.

We take as our state space 𝒳 the set of permutations of {1, ..., n}, and we ascribe a probability to x ∈ 𝒳 by treating it as the sample path of a Markov chain. That is, our family of probability measures is indexed by n × n transition matrices P, and

f(x; P) \propto P(x(n), x(1)) \prod_{i=1}^{n-1} P(x(i), x(i+1)).
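Sampling a permutation as a Markov-chain path means renormalising each row of P over the cities not yet visited at every step. A sketch in Python (the function name and details are ours, mirroring the gen_tour helper used in the R fragment on a later slide):

```python
import numpy as np

def gen_tour(P, rng):
    """Sample a tour (a permutation of 0..n-1) by walking the chain P,
    renormalising the current row over unvisited cities at each step."""
    n = P.shape[0]
    tour = [0]                          # fix the starting city; tours are rotation-invariant
    unvisited = set(range(1, n))
    while unvisited:
        i = tour[-1]
        cities = np.array(sorted(unvisited))
        p = P[i, cities]
        p = p / p.sum()                 # renormalise over the unvisited cities
        tour.append(rng.choice(cities, p=p))
        unvisited.remove(tour[-1])
    return tour

rng = np.random.default_rng(0)
n = 5
P = np.full((n, n), 1 / (n - 1)); np.fill_diagonal(P, 0.0)
tour = gen_tour(P, rng)
print(tour)  # a random permutation of 0..4 starting at 0
```

With the uniform initial P above, every tour is equally likely; as the CE iterations sharpen P, high-probability edges dominate the walk.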
13 Given a sample X_1, ..., X_N from 𝒳, step 3 of the CE algorithm requires us to find the P that maximises

\sum_{i=1}^{N} I_{\{S(X_i) \le \gamma_t\}} \log f(X_i; P).

Note that P is constrained to the space of transition matrices. Note also that, since we are now minimising, the indicator becomes S(X_i) ≤ γ_t. It is easily checked that the solution is given by

\hat{P}(i, j) = \frac{\#\{X_k : S(X_k) \le \gamma_t \text{ and } X_k \text{ includes a transition from } i \text{ to } j\}}{\#\{X_k : S(X_k) \le \gamma_t\}}.
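The update is just edge counting over the elite tours; a small sketch (helper name is ours):

```python
import numpy as np

def update_P(tours, scores, gamma_t):
    """P_hat(i, j) = #{elite tours with transition i -> j} / #{elite tours},
    where a tour is elite if its length is <= gamma_t."""
    n = len(tours[0])
    counts = np.zeros((n, n))
    elite = [t for t, s in zip(tours, scores) if s <= gamma_t]
    for t in elite:
        for a, b in zip(t, t[1:] + t[:1]):  # consecutive cities, plus the closing edge
            counts[a, b] += 1
    return counts / len(elite)              # each row sums to 1: one exit per city per tour

# Two elite tours on 3 cities: 0->1->2->0 and 0->2->1->0
P_hat = update_P([[0, 1, 2], [0, 2, 1]], [1.0, 1.0], gamma_t=2.0)
print(P_hat[0])  # [0.  0.5 0.5]
```

Each elite tour leaves every city exactly once, so each row of the count matrix sums to the number of elite tours, and the resulting P̂ is automatically a valid transition matrix.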
14
n <- 20                      # number of cities
N <- 5*n^2                   # sample size, O(num params)
rho <- 0.01                  # for culling (the value recommended earlier)
al <- 0.5                    # smoothing param, in (0, 1)
cities <- gen_cities(n)
samples <- matrix(nrow=n, ncol=N)
lengths <- rep(NA, N)
P <- matrix(1/(n-1), n, n); diag(P) <- 0
n_reps <- 20
for (rep in 1:n_reps) {
  # generate sample
  for (i in 1:N) {
    samples[,i] <- gen_tour(P)
    lengths[i] <- len_tour(cities$d, samples[,i])
  }
  # identify top performers (we are minimising tour length)
  gamma <- sort(lengths)[ceiling(rho*N)]
  idx <- which(lengths <= gamma)
  # update P, with smoothing
  sp <- samplep(samples[,idx])
  P <- (1 - al)*P + al*sp
}
15 Resetting the setting

In order to say something about the convergence of the CE method, we consider the following setting:
- 𝒳 = {0, 1}^n.
- S has a unique maximum on 𝒳.
- For p = (p_1, ..., p_n) ∈ (0, 1)^n and x = (x_1, ..., x_n) ∈ 𝒳, we put

f(x; p) = \prod_{i=1}^{n} p_i^{x_i} (1 - p_i)^{1 - x_i}.

That is, for X ~ f(·; p), the components X_i are independent Bernoulli r.v.s with parameters p_i.
- The smoothing parameter α_t is allowed to vary.

With this choice of f(·; p) the CE algorithm becomes...
16 CE algorithm

For given p_0, N, ρ, and {α_t}:
1. Put t = 1.
2. Sample X_1, ..., X_N from f(·; p_{t-1}). Let γ_t be the 1-ρ quantile of the ordered sample S(X_1), ..., S(X_N).
3. For w_t = (w_{t,1}, ..., w_{t,n}) put

w_{t,i} = \frac{\#\{X_k : S(X_k) \ge \gamma_t \text{ and } X_{k,i} = 1\}}{\#\{X_k : S(X_k) \ge \gamma_t\}}.

4. Put p_t = α_t w_t + (1 - α_t) p_{t-1}.
5. If γ_t hasn't changed for a while then stop, o/w put t ← t + 1 and go back to 2.
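A runnable sketch of this algorithm (our own toy objective, not from the slides: S counts agreement with a hidden binary string, so the unique maximiser is the string itself; constant α, and the stopping rule simplified to a fixed iteration budget):

```python
import numpy as np

rng = np.random.default_rng(2)
n, N, rho, alpha = 20, 300, 0.1, 0.6
target = rng.integers(0, 2, size=n)           # the unique maximiser (toy example)

def S(x):
    return np.sum(x == target, axis=-1)       # number of matching components

p = np.full(n, 0.5)                           # p_0
for t in range(200):
    X = (rng.random((N, n)) < p).astype(int)  # step 2: sample from f(.; p_{t-1})
    s = S(X)
    gamma_t = np.quantile(s, 1 - rho)
    elite = X[s >= gamma_t]
    w = elite.mean(axis=0)                    # step 3: component-wise elite frequencies
    p = alpha * w + (1 - alpha) * p           # step 4: smoothed update

best = (p > 0.5).astype(int)                  # round p_t to the nearest corner of {0,1}^n
print(S(best))
```

The smoothing in step 4 keeps every p_{t,i} strictly inside (0, 1), so no point of 𝒳 is ever assigned zero probability during the run.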
17 Necessary condition for optimality

The CE algorithm generates the optimum solution with probability 1 only if

\sum_{t=1}^{\infty} \prod_{m=1}^{t} (1 - \alpha_m) = \infty.

Proof. We can show that the probability that the first component is never correct is at least

\prod_{t=1}^{\infty} \left( 1 - \phi_{1,1} \prod_{m=1}^{t-1} (1 - \alpha_m) \right)^N,

for some φ_{1,1} ∈ (0, 1). This term is zero only if the stated condition holds.
18 Sufficient condition for optimality

The CE algorithm generates the optimum solution with probability 1 if

\sum_{t=1}^{\infty} \prod_{m=1}^{t} (1 - \alpha_m)^n = \infty.

Proof. We can show that the probability that we never generate the optimum solution is bounded above by

c \prod_{t=2}^{\infty} \left( 1 - \phi \prod_{m=1}^{t-1} (1 - \alpha_m)^n \right)^N,

for some c, φ ∈ (0, 1). This term is zero if the stated condition holds.

Remark. The condition holds if \sum_{t=1}^{\infty} \alpha_t < \infty.
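Both proofs turn on the standard fact about infinite products: for a_t ∈ [0, 1),

```latex
\prod_{t=1}^{\infty} \left(1 - a_t\right) = 0
\quad \Longleftrightarrow \quad
\sum_{t=1}^{\infty} a_t = \infty .
```

It is applied with a_t = \phi_{1,1} \prod_{m=1}^{t-1} (1 - \alpha_m) for the necessary condition and a_t = \phi \prod_{m=1}^{t-1} (1 - \alpha_m)^n for the sufficient one; the fixed exponent N does not change whether the product is zero.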
19 Note on proofs

Both ultimately rely on upper and lower bounds for p_{t,i}:

p_{0,i} \prod_{m=1}^{t} (1 - \alpha_m) \;\le\; p_{t,i} \;\le\; 1 - (1 - p_{0,i}) \prod_{m=1}^{t} (1 - \alpha_m).

Clearly if \sum_t \alpha_t < \infty then p_{t,i} is bounded away from 0 and 1. That is, a necessary condition for p_t to converge to a unit mass at some x is that \sum_t \alpha_t = \infty.
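A quick numerical illustration of these bounds (our own, with α_t = 1/(t+1)², which is summable): the product ∏(1 - α_m) telescopes to 1/2 rather than 0, so p_{t,i} is trapped inside [p_{0,i}/2, 1 - (1 - p_{0,i})/2] forever.

```python
# Bounds on p_{t,i} with p_{0,i} = 0.5 and alpha_t = 1/(t+1)^2, so sum(alpha_t) < infinity.
p0 = 0.5
prod = 1.0
for m in range(1, 100_001):
    prod *= 1 - 1 / (m + 1) ** 2     # telescopes: prod over m <= t equals (t+2)/(2(t+1)) -> 1/2

lower = p0 * prod                     # p_t can never get below ~0.25
upper = 1 - (1 - p0) * prod           # p_t can never get above ~0.75
print(lower, upper)                   # ~0.25 0.75
```

So with a summable schedule no unit mass can ever form, matching the necessity of \sum_t \alpha_t = \infty.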
20 CE with constant α

If α_t is constant, then p_t converges.

Proof. In much the same way that you find upper and lower bounds on p_{t,i}, you can show that

P(w_{t,i} = 1 \text{ for all } t > k \mid F_k) \ge g(p_{k,i}),
P(w_{t,i} = 0 \text{ for all } t > k \mid F_k) \ge g(1 - p_{k,i}),

where g(u) = \prod_{t=0}^{\infty} \left(1 - (1 - u)(1 - \alpha)^t\right)^N. Moreover, we can find 0 < a < b < 1 such that every time p_{t,i} changes direction (the increment changes sign) it must lie in (a, b). Thus, every time p_{t,i} changes direction, there is a non-zero probability that either w_{t,i} = 1 from then on, or w_{t,i} = 0 from then on. This results in p_{t,i} either increasing or decreasing monotonically (to some limit).
21 CE with constant α

Moreover, p_t converges to a unit mass at some x.

Proof. If p_{t,i} converges to some p_i ∈ (0, 1) then so must w_{t,i}. But we know that w_{t,i} is eventually either 0 or 1, a contradiction.
22 CE with constant α

The probability that the optimal solution is generated increases to 1 as α → 0.

Proof. Previously we obtained an upper bound on the probability that the optimal solution is never generated. We can show that for constant α, this bound decreases to 0 as α → 0.
23 We applied the CE algorithm to a max-cut problem with 8 vertices, using different values of α. For each α we applied the algorithm 100 times to get a Monte-Carlo estimate of the probability that the optimal solution is generated. In the following diagram the x-axis is the number of iterations and the y-axis the proportion of runs in which the optimal solution had been generated.
24 [Figure: proportion of runs in which the optimal solution had been generated (y-axis, 0 to 1) against iteration number (x-axis), with curves for α_t = 1/nt and for constant α = 0.1, 0.5 and 0.8, among others.]
More informationSimulated Annealing. Local Search. Cost function. Solution space
Simulated Annealing Hill climbing Simulated Annealing Local Search Cost function? Solution space Annealing Annealing is a thermal process for obtaining low energy states of a solid in a heat bath. The
More informationOptimization Prof. A. Goswami Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture - 20 Travelling Salesman Problem
Optimization Prof. A. Goswami Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 20 Travelling Salesman Problem Today we are going to discuss the travelling salesman problem.
More informationPart I. C. M. Bishop PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 8: GRAPHICAL MODELS
Part I C. M. Bishop PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 8: GRAPHICAL MODELS Probabilistic Graphical Models Graphical representation of a probabilistic model Each variable corresponds to a
More informationOptimal Sampling and Problematic Likelihood Functions in a Simple Population Model
Optimal Sampling and Problematic Likelihood Functions in a Simple Population Model 1 Pagendam, D.E. and 1 Pollett, P.K. 1 Department of Mathematics, School of Physical Sciences University of Queensland,
More informationMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo Methods Sargur Srihari srihari@cedar.buffalo.edu 1 Topics Limitations of Likelihood Weighting Gibbs Sampling Algorithm Markov Chains Gibbs Sampling Revisited A broader class of
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationProbability Distributions Columns (a) through (d)
Discrete Probability Distributions Columns (a) through (d) Probability Mass Distribution Description Notes Notation or Density Function --------------------(PMF or PDF)-------------------- (a) (b) (c)
More informationQUANTIZATION FOR DISTRIBUTED ESTIMATION IN LARGE SCALE SENSOR NETWORKS
QUANTIZATION FOR DISTRIBUTED ESTIMATION IN LARGE SCALE SENSOR NETWORKS Parvathinathan Venkitasubramaniam, Gökhan Mergen, Lang Tong and Ananthram Swami ABSTRACT We study the problem of quantization for
More informationDistance Metrics and Fitness Distance Analysis for the Capacitated Vehicle Routing Problem
MIC2005. The 6th Metaheuristics International Conference 603 Metrics and Analysis for the Capacitated Vehicle Routing Problem Marek Kubiak Institute of Computing Science, Poznan University of Technology
More informationCombining Interval and Probabilistic Uncertainty in Engineering Applications
Combining Interval and Probabilistic Uncertainty in Engineering Applications Andrew Pownuk Computational Science Program University of Texas at El Paso El Paso, Texas 79968, USA ampownuk@utep.edu Page
More informationON THE DISTRIBUTION OF RESIDUALS IN FITTED PARAMETRIC MODELS. C. P. Quesenberry and Charles Quesenberry, Jr.
.- ON THE DISTRIBUTION OF RESIDUALS IN FITTED PARAMETRIC MODELS C. P. Quesenberry and Charles Quesenberry, Jr. Results of a simulation study of the fit of data to an estimated parametric model are reported.
More information