Robust Monte Carlo Methods for Sequential Planning and Decision Making
1 Robust Monte Carlo Methods for Sequential Planning and Decision Making. Sue Zheng, Jason Pacheco, & John Fisher. Sensing, Learning, & Inference Group, Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology. November 30, 2017.
2 Diversion Detection. [Image: potential diversion points] Given sensor characteristics, observations, and an ideal network, infer deviations (e.g., an unknown network of material diversion).
3 Active Sensing and Decision Making. [Image: announced site visit; satellite/flyby EO/SAR/IR] Announce site inspections to learn causal network structure. Modify sensor configurations to reduce uncertainty of inferences.
4 Sequential Experiment Design Experimental choice is a form of planning. [Image: Liepe et al., 2013]
5 Goal. Develop an algorithm for sequential Bayesian inference and planning with the following desiderata: an information-theoretic approach to planning; effective even for complex models; provable theoretical guarantees; maximal reuse of computation across the inference and planning phases.
6 Probabilistic Model & Inference. [Image: unknown $x$ and observation $y$] Joint probability model: $p(x, y) = p(x)\,p(y \mid x)$, where $p(x)$ is the prior belief in the diversion network and $p(y \mid x)$ is the likelihood of the observations given the structure. Posterior belief in network structure given observations: $p(x \mid y) = \frac{p(x)\,p(y \mid x)}{p(y)}$.
7 Decision-Driven Observations. Each decision selects an observation model for the unknown $X$: for $d = 1$, $Y \sim p(y \mid X; d = 1)$; for $d = 2$, $Y \sim p(y \mid X; d = 2)$; ...; for $d = D$, $Y \sim p(y \mid X; d = D)$. The configuration variable $d \in \{1, \ldots, D\}$ controls the observation model. E.g., announcing inspection of site A affects sensor observations. Choose the configuration to reduce posterior uncertainty.
8 Entropy. $H(X) = \mathbb{E}[-\log p(X)]$. Encodes uncertainty; more reliable than variance in many cases (e.g., multimodality). Coin flip example: $X \sim \mathrm{Bernoulli}(p)$.
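As a small illustration of the coin flip example, the following sketch evaluates the entropy of a Bernoulli variable in nats. The function name `bernoulli_entropy` is a hypothetical helper chosen for this example, not something from the slides.

```python
import math


def bernoulli_entropy(p: float) -> float:
    """Entropy H(X) = E[-log p(X)] in nats for X ~ Bernoulli(p)."""
    if p <= 0.0 or p >= 1.0:
        # A deterministic coin carries no uncertainty (0 * log 0 is taken as 0).
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.5, 0.9):
        print(f"p = {p:4.2f}  H = {bernoulli_entropy(p):.4f} nats")
```

The entropy is largest for a fair coin and shrinks toward zero as the outcome becomes predictable.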
9 Mutual Information. What would the uncertainty of $X$ be if we knew $Y$? $I(X; Y) = H(X) - H(X \mid Y)$, i.e., the prior uncertainty minus the expected posterior uncertainty. I will drop explicit dependence on the configuration variable $d$ when clear from context: $I(X; Y \mid d) \equiv I(X; Y)$.
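For discrete variables the definition can be evaluated exactly. The sketch below computes $H(X) - H(X \mid Y)$ from a joint probability table; the table layout (`joint[i][j]` holding $p(X = i, Y = j)$) and the function names are assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_information(joint):
    """I(X; Y) = H(X) - H(X | Y) for a table with joint[i][j] = p(X=i, Y=j)."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    # Expected posterior uncertainty: sum_j p(y_j) * H(X | Y = y_j).
    h_x_given_y = 0.0
    for j, p_yj in enumerate(p_y):
        if p_yj > 0:
            h_x_given_y += p_yj * entropy([row[j] / p_yj for row in joint])
    return entropy(p_x) - h_x_given_y
```

An independent pair gives zero information, while a pair in which $Y$ determines $X$ gives $I(X; Y) = H(X)$.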
10-12 Information Theoretic Planning. The procedure cycles through three steps. Execute: draw a new observation $Y = y$, with $y \sim p(\cdot \mid x; d^*)$. Inference: update the posterior belief $p(x \mid y; d^*) \propto p(x)\,p(y \mid x; d^*)$. Planning: choose the most informative decision $d^* = \arg\max_d I(X; Y \mid d)$.
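The execute, inference, and planning steps can be sketched on a toy discrete problem where mutual information is computable exactly. Everything concrete here is an assumption for illustration: the two-state unknown, the likelihood table `LIK`, and the function names. The slides target complex models where these quantities must be estimated by sampling instead.

```python
import math
import random

# Hypothetical likelihoods LIK[d][x][y] = p(Y = y | X = x; d).
# Decision 0 is uninformative; decision 1 separates the two states.
LIK = [
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.9, 0.1], [0.2, 0.8]],
]


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)


def posterior(belief, lik_d, y):
    """Inference: p(x | y) is proportional to p(x) p(y | x)."""
    unnorm = [b * lik_d[x][y] for x, b in enumerate(belief)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def mutual_information(belief, lik_d):
    """I(X; Y | d) = H(X) - sum_y p(y) H(X | Y = y)."""
    expected_posterior_entropy = 0.0
    for y in range(len(lik_d[0])):
        p_y = sum(b * lik_d[x][y] for x, b in enumerate(belief))
        if p_y > 0:
            expected_posterior_entropy += p_y * entropy(posterior(belief, lik_d, y))
    return entropy(belief) - expected_posterior_entropy


def plan(belief, lik):
    """Planning: pick the decision with the largest mutual information."""
    scores = [mutual_information(belief, lik_d) for lik_d in lik]
    return max(range(len(lik)), key=scores.__getitem__)


def run_greedy(true_x=1, steps=5, seed=0):
    rng = random.Random(seed)
    belief = [0.5, 0.5]
    decisions = []
    for _ in range(steps):
        d = plan(belief, LIK)                                   # plan
        y = rng.choices(range(2), weights=LIK[d][true_x])[0]   # execute
        belief = posterior(belief, LIK[d], y)                   # infer
        decisions.append(d)
    return decisions, belief
```

Because each plan is chosen from the belief updated with all earlier observations, the loop is closed: observed outcomes feed back into later decisions.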
13-17 Closed-Loop Greedy Planning. Unknown quantity: $X$. Observation sequence: $Y_1, Y_2, Y_3, \ldots, Y_T$. Decision sequence: $d_1, d_2, d_3, \ldots, d_T$. Condition on observed information during planning: $d_t^{\text{greedy}} = \arg\max_d I(X; Y_t \mid y_1^{t-1}; d)$. As each plan is executed, the random observation $Y_t$ is replaced by its realized value $y_t$ (first $y_1$, then $y_2$, then $y_3$).
18 Estimating Mutual Information. Mutual information typically lacks a closed form: $I(X; Y) = \mathbb{E}\left[\log \frac{p(x, y)}{p(x)\,p(y)}\right]$. The evidence integrates over every latent configuration: $p(y) = \int p(x, y)\,dx$. Can use Monte Carlo integration to estimate the integrals from joint samples $\{x_i, y_i\}_{i=1}^N \sim p(x, y)$. The empirical estimate is sensitive to outliers in the small-sample regime.
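A minimal sketch of the plain (non-robust) empirical estimate, under an assumed toy model that is not from the slides: $X \sim \mathcal{N}(0, 1)$ and $Y = X + \text{noise}$, for which the exact value $\tfrac{1}{2}\log(1 + 1/\sigma^2)$ is available for comparison. The evidence $p(y_i)$ is approximated by averaging the likelihood over the sampled $x_j$.

```python
import math
import random


def log_normal_pdf(value, mean, std):
    return -0.5 * math.log(2 * math.pi * std * std) - (value - mean) ** 2 / (2 * std * std)


def estimate_mi(n_samples, noise_std, seed=0):
    """Empirical mean of theta_i = log p(y_i | x_i) - log p_hat(y_i)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    ys = [x + rng.gauss(0.0, noise_std) for x in xs]
    total = 0.0
    for x_i, y_i in zip(xs, ys):
        # Evidence p(y_i) = integral of p(x) p(y_i | x) dx, via the prior samples.
        evidence = sum(math.exp(log_normal_pdf(y_i, x_j, noise_std)) for x_j in xs) / n_samples
        total += log_normal_pdf(y_i, x_i, noise_std) - math.log(evidence)
    return total / n_samples


def exact_mi(noise_std):
    """Closed form for the assumed linear-Gaussian toy model."""
    return 0.5 * math.log(1.0 + 1.0 / noise_std ** 2)


if __name__ == "__main__":
    for n in (20, 100, 500):
        print(n, round(estimate_mi(n, 1.0), 4), "exact:", round(exact_mi(1.0), 4))
```

The cost is quadratic in the number of samples, and with few samples a single extreme $\theta_i$ can dominate the average, which is the sensitivity the slide refers to.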
19 Robust Estimation of Information. [Figure: absolute error versus $\epsilon$ (confidence level $= 1 - 2\epsilon$), showing the robust upper bound and the empirical upper/lower bounds. Images: Catoni, 2010] The robust M-estimator $\hat\theta$ is the solution to the root equation $\sum_i \psi(\alpha(\theta_i - \hat\theta)) = 0$, where $\theta_i = \log \frac{p(x_i, y_i)}{p(x_i)\,p(y_i)}$. The influence function $\psi$ reduces the impact of outliers and provides a quality guarantee in the finite-sample setting.
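The root equation can be solved by bisection, since the left-hand side decreases monotonically in $\hat\theta$. The sketch below assumes one particular influence function, $\psi(x) = \operatorname{sign}(x)\log(1 + |x| + x^2/2)$, which is the widest choice in Catoni's construction; the slides do not state which $\psi$ or which scale $\alpha$ they use, so both are assumptions here, and `catoni_mean` is a hypothetical name.

```python
import math


def psi(x):
    """Assumed influence function: grows only logarithmically for large |x|."""
    if x >= 0:
        return math.log(1.0 + x + 0.5 * x * x)
    return -math.log(1.0 - x + 0.5 * x * x)


def catoni_mean(values, alpha, tol=1e-10):
    """Solve sum_i psi(alpha * (theta_i - theta_hat)) = 0 for theta_hat."""
    lo, hi = min(values), max(values)

    def score(theta_hat):
        return sum(psi(alpha * (v - theta_hat)) for v in values)

    # score is non-negative at lo, non-positive at hi, and decreasing in between.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    data = [0.9, 1.0, 1.1, 1.0, 50.0]  # hypothetical theta_i with one outlier
    print("empirical mean:", sum(data) / len(data))
    print("robust estimate:", catoni_mean(data, alpha=1.0))
```

Near zero $\psi$ behaves like the identity, so for well-behaved samples the estimate stays close to the empirical mean; the outlier's pull is damped because its contribution grows only logarithmically.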
20 Integrated Inference & Planning. [Diagram: MCMC Inference; Robust Evidence Estimate; Planning / Robust MI Estimation] At time $t$, execute the plan and draw an observation: $y_t \sim p(y \mid x; d_t^{\text{greedy}})$. Do posterior inference via MCMC samples $\{x_i\}_{i=1}^N \sim p(x \mid y_1^t)$.
21 Integrated Inference & Planning. For each decision, draw samples $\{x_i\}_{i=1}^N \sim p(x \mid Y_{t+1}, y_1^t; d)$. Robust estimation of the model evidence: $\hat p(Y_{t+1} \mid y_1^t; d)$. Can reuse MCMC samples with importance sampling.
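One way to picture the sample reuse is to weight existing posterior samples by the likelihood of a new observation. The sketch below uses an assumed conjugate toy model ($x \sim \mathcal{N}(0, 1)$, $y \mid x \sim \mathcal{N}(x, 1)$) so the answers can be checked in closed form, and it draws exact posterior samples as a stand-in for MCMC output. It uses a plain average rather than the robust estimator from the slides.

```python
import math
import random


def normal_pdf(value, mean, std):
    return math.exp(-((value - mean) ** 2) / (2 * std * std)) / (std * math.sqrt(2 * math.pi))


def reuse_posterior_samples(y_old, y_new, n_samples=5000, seed=0):
    """Reuse draws from p(x | y_old) to handle a new observation y_new."""
    rng = random.Random(seed)
    # Stand-in for MCMC: for this toy model p(x | y_old) = N(y_old / 2, 1/2).
    samples = [rng.gauss(y_old / 2.0, math.sqrt(0.5)) for _ in range(n_samples)]
    # Importance weights: likelihood of the new observation under each sample.
    weights = [normal_pdf(y_new, x, 1.0) for x in samples]
    # Evidence p(y_new | y_old) is the average weight.
    evidence = sum(weights) / n_samples
    # Self-normalized estimate of the updated posterior mean E[x | y_old, y_new].
    posterior_mean = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return evidence, posterior_mean
```

For this model the exact predictive is $\mathcal{N}(y_{\text{old}}/2,\ 3/2)$ and the exact updated mean is $(y_{\text{old}} + y_{\text{new}})/3$, which gives a quick check that no new sampler run is needed when the weights are well behaved.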
22 Integrated Inference & Planning. For each decision, form the robust MI estimate $\hat I(X; Y_{t+1} \mid y_1^t; d)$. Greedy planning: $d_{t+1}^{\text{greedy}} = \arg\max_d \hat I(X; Y_{t+1} \mid y_1^t; d)$.
23 Sequential Inference & Planning. [Diagram: MCMC; Robust Evidence Estimate; Robust MI Estimation] Run MCMC only when necessary. Avoid drawing additional samples during planning. Sample reuse significantly reduces computation in the inference and planning stages. Algorithmic details are similar to a particle filter...
24 Asymptotic Results. Theoretical bounds on estimators ensure high quality decisions. $\sqrt{N}(\hat H - H) \xrightarrow{d} \mathcal{N}(0, \ldots)$. Establish a central limit theorem as the number of samples $N \to \infty$. Show that the estimators are consistent and approximately Normal with variance $\Theta(1/N)$.
25 Finite Sample Bounds. In practice, finite sample bounds are preferred over asymptotic results: $H(Y) + b - \text{const} < \hat H_Y < H(Y) + b + \text{const}$ with probability $1 - 2\epsilon$, where $b = \mathrm{KL}(p(Y) \,\|\, \hat p(Y))$. Estimates are biased, but the deviation is bounded w.h.p. through use of a robust estimator. [Figure: absolute error versus $\epsilon$ (confidence level $= 1 - 2\epsilon$), showing the robust upper bound and the empirical upper/lower bounds. Image: Catoni, 2010]
26 Diversions in Nuclear Fuel Cycle. [Image] Identify sites for inspection announcement. Perform an intervention to learn causal network structure.
27 Causal Network Inference. Nodes interact linearly according to a directed acyclic graph (DAG) structure. Interaction weight: $w$. Directed acyclic graph: $G$. Node observation: $X$.
28 Causal Network Inference. Graph structure and interaction weights are unknown. [Figure labels: 0 interventions; ?]
29 Causal Network Inference. [Figure: covariance matrix over nodes A, B, C alongside the possible graphs] Cannot determine causality from correlations; need to perform active interventions. Clamp a node to a fixed value (in the figure, $B = 0$). [Figure: graphs over A, B, C after clamping]
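The point that correlations alone cannot reveal direction, while clamping can, is easy to check numerically. The sketch below builds two hypothetical two-node linear-Gaussian models, A causes B and B causes A, with parameters chosen so that their observational covariances match. The weight `W`, noise level `SIGMA`, and the choice to clamp node A are assumptions for this example.

```python
import random

W, SIGMA = 2.0, 1.0              # hypothetical edge weight and noise std
VAR_B = W * W + SIGMA * SIGMA    # marginal variance of B in both models


def sample_a_causes_b(rng, clamp_a=None):
    a = rng.gauss(0.0, 1.0) if clamp_a is None else clamp_a
    b = W * a + rng.gauss(0.0, SIGMA)
    return a, b


def sample_b_causes_a(rng, clamp_a=None):
    b = rng.gauss(0.0, VAR_B ** 0.5)
    if clamp_a is None:
        # Coefficients chosen so (A, B) has the same covariance as the other model.
        a = (W / VAR_B) * b + rng.gauss(0.0, (1.0 - W * W / VAR_B) ** 0.5)
    else:
        a = clamp_a  # clamping A cuts the incoming edge from B
    return a, b


def second_moments(sampler, clamp_a=None, n=5000, seed=0):
    """Return the sample covariance of (A, B) and the sample variance of B."""
    rng = random.Random(seed)
    pairs = [sampler(rng, clamp_a) for _ in range(n)]
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov_ab = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov_ab, var_b


if __name__ == "__main__":
    for name, sampler in (("A -> B", sample_a_causes_b), ("B -> A", sample_b_causes_a)):
        print(name, "observational:", second_moments(sampler))
        print(name, "A clamped to 0:", second_moments(sampler, clamp_a=0.0))
```

Without intervention the two models produce matching statistics. With A clamped to 0, the variance of B drops to the noise variance only when A is a cause of B, so the intervention separates the two graphs.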
30 Causal Network Inference [Image: Cho et al., 2016]
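31 Causal Network Inference. Model: $x_j \mid x_{\mathrm{Pa}(j)}, w_j, G \sim \mathcal{N}(w_j x_{\mathrm{Pa}(j)}, \sigma_j^2)$; $w_j \mid G \sim \mathcal{N}(\ldots)$; $G \sim \text{Uniform-DAG}$. [Diagram: graph $G$ and parameters $w$ generate the observation] Planning: given the previous $t - 1$ observations $\mathcal{X} = \{x^{(1)}, \ldots, x^{(t-1)}\}$, select the intervention that maximizes mutual information: $d^* = \arg\max_d I(G; X \mid \mathcal{X}, d)$. Clamp node $x_d = 0$ and observe the remaining nodes. Decision sequence: $d_1, d_2, d_3, \ldots, d_T$.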
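A minimal sketch of the observation model with an intervention, assuming the graph is given as parent lists in topological order. The data layout (`parents`, `weights`, `sigma`) and the function name are hypothetical; inference over the unknown graph and weights is not shown.

```python
import random


def sample_linear_dag(parents, weights, sigma, clamp=None, rng=random):
    """Draw one observation from a linear-Gaussian DAG.

    parents[j] lists the parents of node j (each index smaller than j),
    weights[j][k] is the weight on parents[j][k], and sigma[j] is the noise
    standard deviation of node j. If clamp is a node index, that node is
    fixed to 0 and its incoming edges are ignored.
    """
    x = []
    for j in range(len(parents)):
        if clamp is not None and j == clamp:
            x.append(0.0)
            continue
        mean = sum(weights[j][k] * x[p] for k, p in enumerate(parents[j]))
        x.append(rng.gauss(mean, sigma[j]))
    return x


if __name__ == "__main__":
    # Hypothetical chain 0 -> 1 -> 2.
    parents = [[], [0], [1]]
    weights = [[], [1.5], [-0.7]]
    sigma = [1.0, 0.5, 0.5]
    rng = random.Random(0)
    print("observational:", sample_linear_dag(parents, weights, sigma, rng=rng))
    print("node 1 clamped:", sample_linear_dag(parents, weights, sigma, clamp=1, rng=rng))
```

Clamping a node removes the influence of its parents while leaving its effect on its children in place, which is what makes the downstream observations informative about edge direction.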
32 Causal Network Inference Robust planning selects most informative interventions in early iterations.
33 Causal Network Inference. [Figure panels: MSE; area under PRC; area under ROC. Median (solid) and best/worst (dashed) out of 50 runs] Chooses more informative experiments than Random. In early iterations, chooses more informative experiments than Cho et al., 2016.
34 Summary. Sequential Bayesian inference and planning applies to complex models (more interesting applications...). Theoretical guarantees on estimator quality for costly decisions. Scale up to larger problems and bound the probability of incorrect selection.
36 Measurement Selection. Sensor types: satellite, flyby, earth-based. Sensing modes: EO/SAR/IR, hyperspectral, radio frequency.
37 Plan Execution Costs. Information/cost tradeoff: $d^* = \arg\max_d \left[ I(X; Y \mid d) - \lambda r(d) \right]$, where $I(X; Y \mid d)$ is the information reward and $r(d)$ is the cost. A plan is feasible if the information justifies the cost: $I(X; Y \mid d) > \lambda r(d)$. Possible costs for this application: sensor costs, power consumption.
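The tradeoff can be sketched as a short selection routine. The information values, costs, and sensor names below are made-up numbers for illustration, and `select_plan` is a hypothetical helper; in the setting of the slides the information terms would come from the MI estimates.

```python
def select_plan(info, cost, lam):
    """Return the feasible decision maximizing info[d] - lam * cost[d].

    A decision is feasible when info[d] > lam * cost[d]. Returns None when
    no decision is worth its cost.
    """
    scores = {d: info[d] - lam * cost[d] for d in info}
    feasible = {d: s for d, s in scores.items() if s > 0}
    if not feasible:
        return None
    return max(feasible, key=feasible.get)


if __name__ == "__main__":
    # Hypothetical information rewards (nats) and execution costs.
    info = {"satellite": 0.8, "flyby": 1.2, "ground": 1.5}
    cost = {"satellite": 1.0, "flyby": 4.0, "ground": 10.0}
    for lam in (0.0, 0.1, 1.0):
        print("lambda =", lam, "->", select_plan(info, cost, lam))
```

Raising $\lambda$ shifts the choice toward cheaper sensors and eventually makes every plan infeasible.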
38 Monte Carlo Integration. Need to compute expected values of the form $\mathbb{E}[f(x)] = \int p(x) f(x)\,dx$. Draw samples from the distribution: $\{x_i\}_{i=1}^N \sim p(x)$. Monte Carlo integration is the empirical mean: $\mathbb{E}[f(x)] \approx \frac{1}{N} \sum_{i=1}^N f(x_i)$. Good statistical properties as $N \to \infty$, but problematic for finite samples.
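A minimal sketch of the empirical-mean estimator, using an assumed test case that is not from the slides: $f(x) = x^2$ under a standard Normal, whose exact expectation is 1.

```python
import random


def monte_carlo_expectation(f, sampler, n_samples, seed=0):
    """Estimate E[f(x)] by the empirical mean of f over n_samples draws."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n_samples)) / n_samples


def standard_normal(rng):
    return rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    for n in (10, 100, 10000):
        estimate = monte_carlo_expectation(lambda x: x * x, standard_normal, n)
        print(f"N = {n:6d}  estimate of E[x^2] = {estimate:.4f}")
```

Running it for increasing $N$ shows the estimate settling near the exact value, while small $N$ can be noticeably off.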
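39 Asymptotic and Finite Sample Bounds. Asymptotic bounds: $\sqrt{N}\,\hat H_Y \xrightarrow{d} \mathcal{N}\left(H(Y),\ \mathbb{E}_y\left[\frac{\sigma_x^2(p(y \mid X))}{M p^2(Y)}\right] + \sigma^2(\log p(Y))\right)$ and $\sqrt{N}\,\hat H_{Y \mid X} \xrightarrow{d} \mathcal{N}\left(H(Y \mid X),\ \sigma^2(\log p(Y \mid X))\right)$. Finite-sample bounds: assuming $N > 2(1 + \log \epsilon^{-1})$, $H(Y) + b - c < \hat H_Y < H(Y) + b + c$ with probability $1 - 2\epsilon$, where $b = \mathbb{E}\left[\log \frac{p(Y)}{\hat p(Y; X)}\right]$ and $c$ is a constant determined by $\sigma^2$, $\epsilon$, and $N$.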
40 Causal Network Inference Robust is more consistent in intervention selection. Average realized gain at t = 1 indicates intervention at node 6 is optimal.
Probabilistic Graphical Models (I) Hongxin Zhang zhx@cad.zju.edu.cn State Key Lab of CAD&CG, ZJU 2015-03-31 Probabilistic Graphical Models Modeling many real-world problems => a large number of random
More informationProbability and Information Theory. Sargur N. Srihari
Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal
More informationSome Probability and Statistics
Some Probability and Statistics David M. Blei COS424 Princeton University February 13, 2012 Card problem There are three cards Red/Red Red/Black Black/Black I go through the following process. Close my
More informationCOMP 551 Applied Machine Learning Lecture 19: Bayesian Inference
COMP 551 Applied Machine Learning Lecture 19: Bayesian Inference Associate Instructor: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise noted, all material posted
More informationBayesian Analysis of Risk for Data Mining Based on Empirical Likelihood
1 / 29 Bayesian Analysis of Risk for Data Mining Based on Empirical Likelihood Yuan Liao Wenxin Jiang Northwestern University Presented at: Department of Statistics and Biostatistics Rutgers University
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Brown University CSCI 2950-P, Spring 2013 Prof. Erik Sudderth Lecture 12: Gaussian Belief Propagation, State Space Models and Kalman Filters Guest Kalman Filter Lecture by
More informationStein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm
Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm Qiang Liu and Dilin Wang NIPS 2016 Discussion by Yunchen Pu March 17, 2017 March 17, 2017 1 / 8 Introduction Let x R d
More informationModeling and state estimation Examples State estimation Probabilities Bayes filter Particle filter. Modeling. CSC752 Autonomous Robotic Systems
Modeling CSC752 Autonomous Robotic Systems Ubbo Visser Department of Computer Science University of Miami February 21, 2017 Outline 1 Modeling and state estimation 2 Examples 3 State estimation 4 Probabilities
More informationRobotics. Mobile Robotics. Marc Toussaint U Stuttgart
Robotics Mobile Robotics State estimation, Bayes filter, odometry, particle filter, Kalman filter, SLAM, joint Bayes filter, EKF SLAM, particle SLAM, graph-based SLAM Marc Toussaint U Stuttgart DARPA Grand
More informationLecture 4: Hidden Markov Models: An Introduction to Dynamic Decision Making. November 11, 2010
Hidden Lecture 4: Hidden : An Introduction to Dynamic Decision Making November 11, 2010 Special Meeting 1/26 Markov Model Hidden When a dynamical system is probabilistic it may be determined by the transition
More information