Identifiability & Adaptive Control of Markov Chains. Ben Pence, Scott Moura. December 12, 2009.
Slide 1 (title): Ben Pence, Scott Moura. EECS 558 Project. December 12, 2009.
Slide 2 (motivation and outline): Stochastic system with unknown parameters; we want to identify the parameters and adapt the control based on the parameter estimate. Outline: identifiability, estimation, and adaptive control; results with identifiability (Mandl, 1973); results without identifiability (Borkar and Varaiya, 1979, 1982); conclusions.
Slide 3: Model: controlled Markov chain. P_ij(u, θ) is the transition probability from state i to state j, for i, j ∈ S; θ ∈ A is the unknown parameter, with A the parameter space; u ∈ U is the control action, with U the control space. Identifiability condition: if θ ≠ β in A, then P_ij(u, θ) ≠ P_ij(u, β) for every u ∈ U.
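When S, U, and A are all finite, the identifiability condition can be checked by direct enumeration. A minimal sketch follows; the transition family P, the state/action labels, and the parameter set below are illustrative assumptions, not the slides' example:

```python
import numpy as np

S = [0, 1]            # states (illustrative)
U = [0, 1]            # control actions (illustrative)
A = [0.1, 0.2, 0.3]   # candidate parameter values (illustrative)

def P(u, theta):
    """Transition matrix P_ij(u, theta) for a 2-state chain (illustrative)."""
    p = theta * (u + 1)          # stays in (0, 1) for these U and A
    return np.array([[p, 1 - p],
                     [1 - p, p]])

def identifiable(P, U, A, tol=1e-12):
    """Identifiability per the slides: theta != beta implies
    P(u, theta) != P(u, beta) for EVERY control u."""
    for i, theta in enumerate(A):
        for beta in A[i + 1:]:
            for u in U:
                if np.allclose(P(u, theta), P(u, beta), atol=tol):
                    return False  # some u cannot distinguish theta from beta
    return True

print(identifiable(P, U, A))      # True for this illustrative family
```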
Slide 4: Examples. [Three example 2-state transition matrices P_ij(u, θ), with u ∈ {1, 2} or u ∈ {1, 4} and θ ∈ {0.1, 0.2, 0.3}, each classified as identifiable or not identifiable.]
Slide 5: Maximum likelihood estimator. Which value of θ makes the observed state trajectory most likely? $\hat{\theta}_t = \arg\max_{\theta} \prod_{s=0}^{t-1} P(x_s, x_{s+1}; u_s, \theta)$, where each factor is the probability of transitioning from x_s to x_{s+1} given u_s and θ.
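Because A is finite, the maximizer can be found by enumerating the candidate parameters. A minimal sketch, reusing the illustrative P and A from the previous block (log-probabilities are summed rather than multiplying raw probabilities, for numerical stability):

```python
import numpy as np

def mle(xs, us, P, A):
    """argmax over theta in A of prod_s P(x_s, x_{s+1}; u_s, theta),
    computed as a sum of log transition probabilities."""
    best_theta, best_ll = None, -np.inf
    for theta in A:
        ll = sum(np.log(P(u, theta)[xs[s], xs[s + 1]])
                 for s, u in enumerate(us))
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta
```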
Slide 6: Adaptive control. For each θ ∈ A, a stationary Markov policy (SMP) has been prespecified: U_t = g_θ(X_t). At each time step, apply the SMP corresponding to the current parameter estimate: U_t = g_{θ̂_t}(X_t).
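A minimal sketch of this certainty-equivalence loop, reusing P, A, and mle from the sketches above. The policy table g is a hypothetical stand-in (constant in x per θ, echoing the shape of the control laws on the later slides, with actions relabeled {0, 1}):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical precomputed SMP per candidate theta (illustrative).
g = {0.1: lambda x: 1, 0.2: lambda x: 0, 0.3: lambda x: 1}

def adaptive_control(P, A, theta_true, T=200, x0=0):
    x, xs, us = x0, [x0], []
    theta_hat = A[0]                      # arbitrary initial estimate
    for _ in range(T):
        u = g[theta_hat](x)               # control from the current estimate
        x = rng.choice(2, p=P(u, theta_true)[x])  # sample next state
        xs.append(x); us.append(u)
        theta_hat = mle(xs, us, P, A)     # re-estimate after each transition
    return theta_hat

print(adaptive_control(P, A, theta_true=0.2))
```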
Slide 7 (outline divider): results with identifiability (Mandl, 1973).
Slide 8: Theorem 1. For the model under consideration, with the assumptions previously outlined (including identifiability), there exists a random time T such that for t ≥ T, $\hat{\theta}_t = \theta^*$: the parameter estimate reaches a limit point. Moreover $\theta^* = \theta^0$: the limit point is the true parameter value! Consequently $P(i, j; g_{\theta^*}(i), \theta^*) = P(i, j; g_{\theta^0}(i), \theta^0)$ for all $i, j \in S$: the closed-loop transition probabilities are the same as if we had known the true parameter θ^0.
Slide 9: Simulation with identifiability. Markov chain: S = {1, 2}, U = {1, 2}, A = {0.1, 0.2, 0.3}. [Plot: parameter estimate θ̂_t versus time t converges to the true parameter value.] Control law: g_θ(i) = 2 for θ = 0.1; g_θ(i) = 1 for θ = 0.2; g_θ(i) = 2 for θ = 0.3.
Slide 10 (outline divider): results without identifiability (Borkar and Varaiya, 1979, 1982).
Slide 11: Hypothesis: the identifiability assumption is too restrictive. [Example transition matrix P_ij(u, θ) with u ∈ {1, 2} and θ ∈ {0.1, 0.2, 0.3}, classified as identifiable or not identifiable.] Question: what happens if we relax this assumption?
Slide 12: Theorem 2. For the model under consideration, with the assumptions previously outlined except identifiability, there exists a random time T such that for t ≥ T, $\hat{\theta}_t = \theta^*$: the parameter estimate reaches a limit point. But this limit point is not necessarily the true parameter! However, $P(i, j; g_{\theta^*}(i), \theta^*) = P(i, j; g_{\theta^*}(i), \theta^0)$ for all $i, j \in S$: the closed-loop transition probabilities are the same for (1) the true system and (2) an imaginary system where the parameter really was θ*, when the control law corresponding to θ* is applied to both.
Slide 13: Simulation without identifiability. Markov chain: S = {1, 2}, U = {1, 2}, A = {0.1, 0.2, 0.3}; true parameter θ^0 = 0.2. [Plots: control u and parameter estimate θ̂_t versus time t. Lack of identifiability results in the control getting stuck at u = 2, and the estimate converges to an INCORRECT parameter value.] Control law: g_θ(i) = 2 for θ = 0.1; g_θ(i) = 1 for θ = 0.2; g_θ(i) = 2 for θ = 0.3.
Slide 14: Recall the objectives: estimate the unknown parameter; satisfactorily control the Markov chain? Source of difficulty: the scheme only uses the control U_k = g_{θ*}(X_k), so it can only identify the subset {P_ij(u) : u = g_{θ*}(i)} of all transition probabilities {P_ij(u) : u ∈ U}. The adaptive controller should probe outside this subset.
Slide 15: Main idea: probe outside the identified subset of transition probabilities by perturbing the control signals randomly. Key assumption, partial identifiability: for θ ≠ β in A, the MC is partially identifiable if in every open set O ⊆ U there exists u ∈ O such that P_ij(u, θ) ≠ P_ij(u, β).
Slide 16: Construction of randomized controls. Let the current parameter estimate be θ, with control u = g_θ(x). Construct an open ball B_ε(u_t) of small radius ε > 0 centered at each u_t. Construct a probability measure μ on U which assigns probabilities to open balls. Perform an independent experiment to generate the random control u_t from B_ε(g_θ(x_t)) using μ.
Slide 17: Theorem 3. For the model under consideration, with the assumptions previously outlined including partial identifiability, then under any ε-randomization of g: $\lim_{t \to \infty} \hat{\theta}_t = \theta^*$, and $\theta^* = \theta^0$: the limit point is the true parameter value! Moreover $P(i, j; g_{\theta^*}(i), \theta^*) = P(i, j; g_{\theta^0}(i), \theta^0)$ for all $i, j \in S$: the closed-loop transition probabilities are the same as if we had known the true parameter θ^0.
Slide 18: Simulation with randomized control. Markov chain: S = {1, 2}, U = {1, 2}, A = {0.1, 0.2, 0.3}; true parameter θ^0 = 0.2. [Plots: control u and parameter estimate θ̂_t versus time t. The randomized control explores outside of u = 2 (control probes), and the estimate converges to the true parameter.] Randomized control law: P(u = 2 | θ̂ ∈ {0.1, 0.3}) = 0.75, P(u = 1 | θ̂ ∈ {0.1, 0.3}) = 0.25, P(u = 1 | θ̂ = 0.2) = 0.75, P(u = 2 | θ̂ = 0.2) = 0.25.
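A minimal sketch of such a probing controller for this finite U, reusing the hypothetical policy table g from the earlier sketch. Flipping to the other action with probability ε = 0.25 reproduces the 0.75/0.25 split above; in a continuum U one would instead sample from B_ε(g_θ(x)) via μ:

```python
import numpy as np

rng = np.random.default_rng(1)

def randomized_control(theta_hat, x, eps=0.25):
    u = g[theta_hat](x)        # nominal certainty-equivalence control
    if rng.random() < eps:     # with probability eps, probe elsewhere
        u = 1 - u              # the only other action when U = {0, 1}
    return u
```

Replacing `u = g[theta_hat](x)` in the earlier adaptive-control loop with `u = randomized_control(theta_hat, x)` turns it into the probing scheme of Theorem 3.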
Slide 19: Recall the objectives: estimate the unknown parameter (achieved); satisfactorily control the Markov chain (achieved). Summary: the restrictive identifiability condition was relaxed through local probing.
Slide 20 (outline divider): conclusions.
Slide 21: Conclusions. We can simultaneously identify and control a Markov chain with unknown parameters. Under the identifiability condition, the MLE converges to the true parameter in closed loop. Without the identifiability condition, the MLE may not converge to the true parameter. With probing control, only partial identifiability is needed for convergence to the true parameter.
Slide 22: Future work. Theoretical questions: What if θ^0 ∉ A? Can the control randomization be stopped eventually? Alternative estimation methods? Application areas: adaptive control of hybrid vehicles through drive-cycle identification; algorithms for calculating the maximum likelihood estimate.
Slide 23: Thank you for your attention! Questions? Comments? Suggestions?
Slide 24: Appendix slides.
Slide 25 (appendix): Maximum likelihood estimator. $L_t(\theta) = \prod_{s=0}^{t-1} P(X_s, X_{s+1}; U_s, \theta)$, and $\hat{\theta}_t = \arg\max_{\theta \in A} L_t(\theta)$. Likelihood ratio: $\Lambda_t(\theta) = L_t(\theta) / L_t(\theta^0)$.
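Since the logarithm is monotone, maximizing L_t(θ) is equivalent to maximizing the log-likelihood, which is what one would compute in practice because products of many probabilities underflow:

```latex
\log L_t(\theta) = \sum_{s=0}^{t-1} \log P(X_s, X_{s+1}; U_s, \theta),
\qquad
\hat{\theta}_t = \arg\max_{\theta \in A} \log L_t(\theta).
```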
Slide 26 (appendix): Construction of randomized controls. Suppose the current parameter estimate is θ and the corresponding control is u = g_θ(x). Construct an open ball with small radius ε > 0 centered at u, denoted B_ε(u_t), for each u ∈ U. Construct a probability measure μ on U which assigns probabilities to open balls. Perform an independent experiment to generate the random control u_t from B_ε(g_θ(x_t)) using μ. The sequence {u_t, t = 0, 1, ...} is called the ε-randomization of g.
Slide 27 (appendix outline): Model and problem formulation; main results under identifiability (1st paper); results without identifiability (2nd paper); control randomization (3rd paper); concluding remarks.
Slide 28 (appendix): Motivation: the identifiability assumption is too restrictive. It requires that for every θ ≠ β in A there exists at least one i ∈ S such that $[P(i,1;u,\theta)\ P(i,2;u,\theta)\ \cdots\ P(i,I;u,\theta)] \neq [P(i,1;u,\beta)\ P(i,2;u,\beta)\ \cdots\ P(i,I;u,\beta)]$ for all $u \in U$. Question: what happens if we relax this assumption?
Slide 29 (appendix outline): Model and problem formulation; main results under identifiability (1st paper); results without identifiability (2nd paper); control randomization (3rd paper); concluding remarks.
Slide 30 (appendix): Controlled Markov chain (MC): finite state space S = {1, 2, ..., I}; finite action space U; finite parameter space A; transition probabilities {P(i, j; u, θ) : i, j ∈ S, u ∈ U, θ ∈ A}; stationary Markov policy (SMP) U_t = g(X_t); perfect recall.
Slide 31 (appendix): Problem statement: estimate the unknown parameter; adaptively control the Markov chain; analyze the asymptotic convergence properties. Does the estimate converge or not? What assumptions are necessary? Quantify the performance of the adaptive MC.
Slide 32 (appendix): Assumptions: 1. the controlled MC is irreducible for every u ∈ U and θ ∈ A; 2. the true parameter θ^0 ∈ A; 3. the SMP corresponding to each θ, g_θ, is precomputed; 4. identifiability condition: if θ ≠ β in A, then P_ij(u, θ) ≠ P_ij(u, β) for every u ∈ U.
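One plausible reading of assumption 1 is that P(·, ·; u, θ) is irreducible for every u and θ. For finite chains this can be checked with the standard reachability test, sketched here against the illustrative P, U, and A from the earlier sketches:

```python
import numpy as np

def irreducible(M, tol=1e-12):
    # Standard test: a finite chain with transition matrix M is irreducible
    # iff (I + adjacency)^(n-1) is entrywise positive, i.e. every state can
    # reach every other state.
    n = M.shape[0]
    reach = np.linalg.matrix_power(np.eye(n) + (M > tol), n - 1)
    return bool(np.all(reach > 0))

# Check assumption 1 for the illustrative family:
assert all(irreducible(P(u, th)) for u in U for th in A)
```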
Slide 33 (appendix outline): Model and problem formulation; main results under identifiability (1st paper); results without identifiability (2nd paper); control randomization (3rd paper); concluding remarks.