CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar


CONTROL OF STOCHASTIC SYSTEMS

P.R. Kumar
Department of Electrical and Computer Engineering, and Coordinated Science Laboratory, University of Illinois, Urbana-Champaign, USA

Keywords: Markov chains, transition probabilities, controlled Markov chains, noisy observations, partially observed systems, linear stochastic systems, linear Gaussian systems, controlled autoregressive models, autoregressive moving average systems with exogenous inputs, cost functions, time horizon, terminal cost, running cost, state-feedback policy, Markov policy, optimal cost-to-go, dynamic programming equation, principle of optimality, linear quadratic Gaussian problem, equilibrium points, stability, countable Markov chains, steady state, super-martingales, Lyapunov functions, stopping times, recurrence, positive recurrence, stochastic stability, estimation, Bayes rule, Kalman filter, minimum mean-square error estimate, conditional mean, conditional covariance, posterior probability distribution, Bayesian and non-Bayesian approaches to adaptive control, parameter vector, maximum likelihood, least squares estimate, prediction error estimate, system identification, consistency, separation theorem, certainty equivalence, information state, hyperstate, self-tuning regulators.

Contents

1. Introduction
2. Models of Stochastic Systems
3. Optimal Stochastic Control
4. Stability of Stochastic Systems
5. Estimation of Stochastic Systems
6. Identification and Parameter Estimation of Stochastic Systems
7. Control of Partially Observed Systems
8. Adaptive Control
Glossary
Bibliography
Biographical Sketch

Summary

We present an account of several topics (modeling, control, estimation, stability, identification, and adaptive control) which arise in the study of the control of stochastic systems.

1. Introduction

A holistic treatment of the problem of control of stochastic systems encompasses the following topics:

(i) Models of stochastic systems
(ii) Optimal stochastic control

(iii) Stability of stochastic systems
(iv) Estimation of stochastic systems
(v) Identification of stochastic systems
(vi) Control of partially observed systems
(vii) Adaptive control

We present an outline of each of these topics, which will enable the reader to obtain an integrated perspective of the field.

2. Models of Stochastic Systems

A discrete-time stochastic process {x(t), t = 0, 1, 2, ...} is a Markov chain if

p(x(t+1) | x(0), ..., x(t)) = p(x(t+1) | x(t)),

that is, the conditional distribution of the future state x(t+1) depends on the past (x(0), ..., x(t)) only through the present state x(t). Indeed, this justifies the use of the name "state." The Markov chain can then be described by its transition probabilities p(x(t+1) | x(t)).

Extending this notion, one can describe a controlled Markov chain by its controlled transition probabilities p(x⁺ | x, u), which give the conditional probability of the next state x(t+1) being x⁺ when the current state is x(t) = x and the input u(t) = u is applied.

If the state x(t) is not observed, then it is common to model the observations y(t) by the conditional probability distribution p(y | x), which describes the probability distribution of the observation y(t) when the state x(t) = x. The system is then called a partially observed controlled Markov chain.

If the transition probabilities depend on the time t, then one can describe the time-varying system by the pair of transition probabilities p(x⁺ | x, u, t) and p(y | x, t).

A common deterministic, noise-free state space model of a system in discrete time is

x(t+1) = f(x(t), u(t), t)
y(t) = g(x(t), t),

where x(t) is the state of the system at time t, u(t) is the input applied at time t, and y(t) is the output at time t. The corresponding stochastic analog of the state space model is

x(t+1) = f(x(t), u(t), w(t), t)
y(t) = g(x(t), v(t), t),

where w(t) is the noise entering the state equation, and v(t) is the noise entering the observation equation. These noises are modeled as stochastic processes (see Models of Stochastic Systems). If {w(0), w(1), w(2), ...} are mutually independent, then x(t) is indeed the state of a controlled Markov chain. If further {v(0), v(1), v(2), ...} are also mutually independent, and w, v are independent of each other, then one has a partially observed controlled Markov chain written in the form of state and observation equations.

If {w(0), w(1), w(2), ...} are not mutually independent, then one often models them as the output of a system driven by independent random variables {n(0), n(1), ..., m(0), m(1), ...}:

z(t+1) = h(z(t), n(t))
w(t) = k(z(t), m(t)).

In such situations, one can adjoin z to x and let (x, z) serve as the state.

A special and important case of such a state space model is a linear stochastic system:

x(t+1) = A(t) x(t) + B(t) u(t) + G(t) w(t)
y(t) = C(t) x(t) + H(t) v(t),

where A(t), B(t), C(t), G(t), and H(t) are time-varying matrices of appropriate dimensions. A model that particularly lends itself to analysis arises when the noise processes w(t) and v(t) are jointly Gaussian stochastic processes (see Models of Stochastic Systems); then it is called a linear-Gaussian model.

Instead of dealing with the state x(t), one can directly model how the input influences the output, i.e., by an input-output model. The most common such model is a Controlled Autoregressive Moving Average (CARMA) model, also called an Autoregressive Moving Average model with Exogenous Inputs (ARMAX):

y(t) + a_1 y(t-1) + ... + a_n y(t-n) = b_0 u(t) + b_1 u(t-1) + ... + b_n u(t-n) + w(t) + c_1 w(t-1) + ... + c_n w(t-n).

One can also consider the continuous-time counterpart of the state-space model:

dx(t) = f(x(t), u(t), t) dt + σ(x(t)) dw(t)
dy(t) = g(x(t)) dt + dv(t).

Here w(t) and v(t) are Brownian motion processes, and one has to interpret the above stochastic differential equations in the appropriate mathematical way. This requires a knowledge of Ito stochastic integrals and stochastic calculus.

3. Optimal Stochastic Control

Consider the case of a discrete-time stochastic system where the state x(t) is directly observed. How should one choose the control input {u(t)} to be applied to such a system? A common approach is to consider a cost function of the form

E[ h(x(T+1)) + Σ_{t=0}^{T} c(x(t), u(t)) ],

and choose control inputs which minimize this expected cost. Above, T is a time horizon, h(x(T+1)) is the terminal cost, and c(x(t), u(t)) is the running cost.

One minimizes this cost over the set of history-dependent strategies, where u(t) = u(x(0), ..., x(t), t) is allowed to depend on the entire past of the observations and the current time. It can be shown that within this class of history-dependent strategies one can restrict attention to strategies of the form u(t) = u(x(t), t), where the input depends only on the current state and the current time. Such a strategy is termed a state feedback policy or a Markov policy.

If one defines the optimal remaining cost, or optimal cost-to-go, from a state x at time t by

V(x, t) := Min_{u(·)} E[ h(x(T+1)) + Σ_{s=t}^{T} c(x(s), u(s)) | x(t) = x ],

then it can be shown that this function satisfies the following equation:

V(x, t) = Min_u [ c(x, u) + Σ_{x⁺} p(x⁺ | x, u) V(x⁺, t+1) ],

with the terminal condition V(x, T+1) = h(x).

Essentially, the above equation says that the optimal cost from a state x at time t is obtained by considering the different choices of an input u to apply at time t. For each such potential input u, one determines the current cost c(x, u) as well as the expected cost from the state reached at the next time instant. Then one simply chooses the best input to apply at the present time as the one which minimizes the sum of the current cost plus the expected remaining cost. This equation is called the dynamic programming equation, and the logic leading to it the principle of optimality. It also

follows that if for each (x, t) one chooses the minimizing u, calling it u(x, t), then u(x, t) is the optimal policy. Thus the optimal policy can be chosen as a Markov, or state feedback, policy. The dynamic programming approach can be extended to other models and situations, as shown in Dynamic Programming.

A particular special case of great interest in control is the so-called Linear-Quadratic-Gaussian (LQG) problem. For a linear system with independent white Gaussian noises w and v,

x(t+1) = A(t) x(t) + B(t) u(t) + G(t) w(t)
y(t) = C(t) x(t) + H(t) v(t),

one seeks to minimize a quadratic cost criterion:

E[ x'(T+1) S x(T+1) + Σ_{t=1}^{T} ( x'(t) Q(t) x(t) + u'(t) R(t) u(t) ) ],

where S ≥ 0, Q(t) ≥ 0, and R(t) > 0. The cost-to-go function turns out to be a quadratic function of the state plus a deterministic term:

V(x, t) = x' S(t) x + γ(t).

By substituting this form in the dynamic programming equation, one can solve for S(t) and γ(t) in terms of S(t+1) and γ(t+1) (remember that dynamic programming solves the problem backwards in time). With the boundary conditions S(T+1) = S and γ(T+1) = 0, one thus obtains recursions for S(t) and γ(t). From the minimizing argument in the dynamic programming equation one also determines that the optimal control law is of the form u(t) = K(t) x(t), i.e., linear time-varying feedback, with K(t) expressible in terms of S(t).

The LQG problem thus admits a clean solution. The details of the solution are given in LQ Stochastic Control. Given that the quadratic cost function is a reasonable criterion, and given the widespread usage of linear models, this solution has proved to be eminently useful in control system design.

In many situations of interest, e.g., in adaptive control and self-tuning regulators (see Self-Tuning Control), one wishes to work with input-output models with quadratic costs. This is dealt with in Minimum Variance Control.
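The backward recursion for S(t) and K(t) described above can be sketched in a few lines of code. The following is a minimal illustrative sketch, not taken from the chapter: it assumes time-invariant matrices A, B, Q, R for brevity (the time-varying case simply indexes them by t) and uses the sign convention u(t) = -K(t) x(t). Note that the noise term G(t) w(t) affects only the deterministic term γ(t) in the cost-to-go, not the gains, so the recursion for S(t) and K(t) is the same as for the deterministic LQ problem.

```python
import numpy as np

def lq_gains(A, B, Q, R, S_terminal, T):
    """Backward Riccati recursion for the finite-horizon LQ problem.

    Returns lists K, S with K[t] and S[t] defined for t = 1..T (and
    S[T+1] = S_terminal), so that the optimal control is
    u(t) = -K[t] @ x(t)  (sign convention chosen here).
    """
    S = [None] * (T + 2)      # S[t], t = 1..T+1
    K = [None] * (T + 1)      # K[t], t = 1..T
    S[T + 1] = S_terminal     # boundary condition S(T+1) = S
    for t in range(T, 0, -1):  # dynamic programming runs backwards in time
        Sn = S[t + 1]
        # Gain: K(t) = (R + B' S(t+1) B)^{-1} B' S(t+1) A
        K[t] = np.linalg.solve(R + B.T @ Sn @ B, B.T @ Sn @ A)
        # Riccati update: S(t) = Q + A' S(t+1) A - A' S(t+1) B K(t)
        S[t] = Q + A.T @ Sn @ A - A.T @ Sn @ B @ K[t]
    return K, S
```

As a sanity check, for the scalar system x(t+1) = x(t) + u(t) with Q = R = 1 and zero terminal cost, lengthening the horizon makes S(1) converge to the stationary Riccati solution (1 + √5)/2, with steady-state gain (√5 - 1)/2.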

- - -

TO ACCESS ALL THE 18 PAGES OF THIS CHAPTER, Click here

Bibliography

B. Anderson and J.B. Moore, Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979. [Contains a comprehensive treatment of estimation for linear stochastic systems.]

D.P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models. Englewood Cliffs, NJ: Prentice-Hall, 1987. [A comprehensive treatment of discrete-time dynamic programming.]

G.C. Goodwin and K.S. Sin, Adaptive Filtering, Prediction and Control. Englewood Cliffs, NJ: Prentice-Hall, 1984. [Contains a treatment of discrete-time modeling of linear systems, as well as identification and adaptive control.]

P.R. Kumar and P.P. Varaiya, Stochastic Systems: Estimation, Identification and Adaptive Control. Englewood Cliffs, NJ: Prentice-Hall, 1986. [Contains a concise treatment of several topics including modeling, control, estimation, identification, and adaptation.]

H. Kushner, Introduction to Stochastic Control. Holt, Rinehart and Winston, 1971. [Contains a treatment of stochastic control as well as stochastic stability.]

L. Ljung and T. Söderström, Theory and Practice of Recursive Identification. Cambridge, MA: MIT Press, 1983. [Contains a treatment of identification and parameter estimation for linear systems.]

Biographical Sketch

P.R. Kumar is the Franklin W. Woeltge Professor of Electrical and Computer Engineering, and a Research Professor in the Coordinated Science Laboratory, at the University of Illinois, Urbana-Champaign. He was the recipient of the Donald P. Eckman Award of the American Automatic Control Council. He has presented plenary lectures at the SIAM Annual Meeting and the SIAM Control Conference in 2001, the IEEE Conference on Decision and Control in San Antonio, Texas, 1993, the SIAM Conference on Optimization in Chicago, 1992, the SIAM Annual Meeting in San Diego, 1994, the Brazilian Automatic Control Congress, and the Third Annual Semiconductor Manufacturing, Control and Optimization Workshop. He is co-author, with Pravin Varaiya, of the book Stochastic Systems: Estimation, Identification and Adaptive Control.
He serves on the editorial boards of Communications in Information and Systems; Journal of Discrete Event Dynamic Systems; Mathematics of Control, Signals and Systems; and Mathematical Problems in Engineering: Problems, Theories and Applications. In the past he has served as Associate Editor at Large of the IEEE Transactions on Automatic Control, and as Associate Editor of the SIAM Journal on Control and Optimization; Systems and Control Letters; Journal of Adaptive Control and Signal Processing; and the IEEE Transactions on Automatic Control. He is a Fellow of the IEEE. Professor Kumar's current research interests are in wireless networks, distributed real-time systems, wafer fabrication plants, and machine learning.