Particle Filtering Approaches for Dynamic Stochastic Optimization
John R. Birge, The University of Chicago Booth School of Business
Joint work with Nicholas Polson, Chicago Booth
I-Sim Workshop, Purdue U., July 2015
Themes
- Dynamic stochastic optimization models face the curse of dimensionality
- Randomization can overcome some dimensionality effects
- With sufficient symmetry, the results can be quite effective
- Keys are conditioning properly on outcomes, appropriate re-sampling and split sampling, and uses of Gaussian mixtures
Outline
- Basic problem
- Problems of estimation
- Particle filters and sequential methods for simulation
- Optimization by slice sampling
- Dynamic Gaussian problems
- Extensions to Gaussian mixtures
General Problem: Find a set of controls, u_t, t = 0, ..., T, to

\min E\Big[\sum_{t=0}^{T} f_t(x_{t+1}, u_t)\Big] \quad \text{s.t.}\; u_t \in \mathcal{U}_t \cap \mathcal{A}_t,\; x_{t+1} = g_t(x_t, u_t),

where \mathcal{A}_t is t-measurable (nonanticipative), \mathcal{U}_t represents the possibility of additional constraints on the controls, g_t represents the state dynamics, and f_t represents the cost of the current action and next state transition.
Challenges:
- Rapid expansion of states in the dimension of x_t in each period
- Exponential growth of trees over time
First Problem: What is x_t?
Example: hidden Markov model
- We begin with inventory x_0 (which may be uncertain)
- We make observations y_1, ..., y_T based on reported sales and new inputs
- What is the distribution of the actual inventory x_t at any time t?
Bayesian Interpretation
- Suppose a distribution on x_0: f(x_0)
- Assume some transition: p(x_t | x_{t-1})
- Observations: q(y_t | x_t)
- Goal: find P(x_t | y_1, ..., y_t) (or conditioning on y_1, ..., y_T)
- Idea: use Bayes' rule to find the conditional probability:
  P(x_t | y^t) = P(x_t, y^t)/P(y^t) \propto \prod_s q(y_s | x_s)\, p(x_s | x_{s-1})\, f(x_0)
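The Bayes-rule idea on this slide is usually written as the sequential filtering recursion; using the transition p and observation density q defined above (a standard reconstruction, not an additional result from the talk):

```latex
p(x_t \mid y_{1:t}) \;\propto\; q(y_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1},
\qquad p(x_0 \mid y_{1:0}) := f(x_0).
```

Each time step first predicts x_t through the transition density, then corrects by the likelihood of the new observation; particle methods approximate the integral with a weighted sample.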
Particle Methods
Idea:
- Keep a limited number (N) of particles in each period
- Propagate forward to obtain new points
- Re-sample (re-weight) backwards to get consistent weights
- Re-balance if weights are low
References: Gordon, Salmond, Smith (1993); Doucet (2008)
General Processes
- If the observation and transition functions, q and p, are quite general, then P(x_t | y^t) may not be available analytically (e.g., no sales if stocked out, maximum limit, bulky demand)
- Alternatives: Markov chain Monte Carlo (MCMC); particle methods
Sequential Importance Resampling (SIR) Method
1. Propagate. Draw x_{t+1,j}, j = 1, ..., N, from f(x_{t+1,j} | x_{t,j}).
2. Re-weight. Update the weights \hat{w}_{t,j}, j = 1, ..., N, to
   \hat{w}_{t+1,j} = w_{t,j}\, g(y_{t+1} | x_{t+1,j}) \Big/ \sum_{j=1}^{N} w_{t,j}\, g(y_{t+1} | x_{t+1,j})
   for j = 1, ..., N.
3. Re-sample. If a re-sampling condition is satisfied, draw N new samples x'_{t+1,j}, j = 1, ..., N, from {x_{t+1,j}, j = 1, ..., N} with probability distribution given by \hat{w}_{t+1,j}, j = 1, ..., N; let x_{t+1,j} = x'_{t+1,j} and \hat{w}_{t+1,j} = 1/N for j = 1, ..., N.
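The three SIR steps can be sketched in a few lines of Python. This is a generic sketch, not the talk's implementation: the model callbacks, the effective-sample-size trigger, and the default N are illustrative assumptions.

```python
import numpy as np

def sir_filter(y, x0_sampler, transition_sampler, obs_likelihood,
               N=500, ess_threshold=0.5, rng=None):
    """Sequential Importance Resampling (SIR) particle filter sketch."""
    rng = rng or np.random.default_rng(0)
    x = x0_sampler(N, rng)                 # N particles from the prior f(x0)
    w = np.full(N, 1.0 / N)                # uniform initial weights
    means = []
    for yt in y:
        x = transition_sampler(x, rng)     # 1. Propagate through f(x_{t+1} | x_t)
        w = w * obs_likelihood(yt, x)      # 2. Re-weight by g(y_{t+1} | x_{t+1})
        w = w / w.sum()
        # 3. Re-sample when the effective sample size drops too low
        ess = 1.0 / np.sum(w ** 2)
        if ess < ess_threshold * N:
            idx = rng.choice(N, size=N, p=w)
            x = x[idx]
            w = np.full(N, 1.0 / N)
        means.append(np.sum(w * x))        # weighted posterior-mean estimate
    return np.array(means)
```

The ESS-based trigger (re-sample when ESS falls below a fraction of N) is one common choice of "re-sampling condition"; others re-sample every step.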
Tractable Case: LQG
Linear, Quadratic, Gaussian (LQG) model:
y_t = H x_t + v_t;  x_t = F x_{t-1} + u_t;  v_t ~ N(0, R), u_t ~ N(0, Q)
Implications: all of the distributions are normal, so analytical formulas can be derived (Kalman filter).
Iterations:
w_t = y_t - H \hat{x}_{t|t-1};  S = H P H^T + R;  K = P H^T S^{-1};
\hat{x}_t = \hat{x}_{t|t-1} + K w_t;  P = (I - K H) P;
prediction: \hat{x}_{t|t-1} = F \hat{x}_{t-1};  P = F P F^T + Q
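A minimal scalar implementation of the Kalman iterations above; the inventory-like numbers in the usage note are illustrative, not the talk's data.

```python
import numpy as np

def kalman_step(x_prev, P_prev, y, F, H, Q, R):
    """One scalar Kalman filter iteration: predict with F, Q; update with H, R."""
    # Predict: x_{t|t-1} = F x_{t-1},  P_{t|t-1} = F P F' + Q
    x_pred = F * x_prev
    P_pred = F * P_prev * F + Q
    # Update: innovation w_t = y_t - H x_{t|t-1},  S = H P H' + R,  K = P H' S^{-1}
    w = y - H * x_pred
    S = H * P_pred * H + R
    K = P_pred * H / S
    x_new = x_pred + K * w
    P_new = (1 - K * H) * P_pred
    return x_new, P_new
```

For example, starting from a prior N(10, 4) and feeding a few noisy level observations (F = H = 1, Q = 0.1, R = 1) pulls the estimate toward the observations while the posterior variance shrinks toward its steady state.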
Kalman Prediction (figure): inventory over 10 periods, actual (*) vs. observed (+). Kalman estimate: x_t ~ N(\hat{x}_t, p_t); here x_10 ~ N(13.8, 0.62).
Comparison to MCMC/Particle Filter
- Idea: construct a sample distribution that fits the data using only information about one-step transitions
- Goal: maintain a constant number of samples across time instead of an explosively growing tree
- Can it work? Why?
Method Structure (diagram): particles (w_t^j, x_t^j), j = 1, ..., N, are propagated forward, then re-sampled against the new observation y_{t+1}.
Cumulative Results (figure): N = 50 particles, compared to the Kalman filter.
Particle Frequency (figure): N = 50; particle (+) vs. Kalman (*).
Other Uses of Particles
- Particle filtering: finding x_t from y^t = (y_1, ..., y_t)
- Particle smoothing: finding x_t from y^T = (y_1, ..., y_T)
- Particle learning (Polson et al.): find parameters \theta_t at the same time to explain the model
Simulation
Extensions to Optimization
Problem: find x to \min_x f(x) subject to g(x) \le 0.
Pincus (1968): enough to sample from \pi_\lambda(x) \propto \exp\{-\lambda f(x)\};
with constraints (e.g., Geman/Geman (1985)): \pi_{\lambda,\eta}(x) \propto \exp\{-\lambda (f(x) + \eta g(x))\}.
Extension: augment with latent variables (\omega, \tilde{\omega}); then we can write
\pi_{\lambda,\eta}(x) \propto \exp\{-\lambda (f(x) + \eta g(x))\} = e^{ax} E_\omega\{e^{-x^2/(2\omega)}\}\, e^{bx} E_{\tilde\omega}\{e^{-x^2/(2\tilde\omega)}\},
where the parameters (a, b) depend on (\lambda, \eta) and the functional forms of (f(x), g(x)).
Basic Optimization via Simulation
Pincus result:
- Model: min F(x)
- Create a distribution P_\lambda(x) \propto \exp(-\lambda F(x))
- Then: \lim_{\lambda \to \infty} E_{P_\lambda}(x) = x^*
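The concentration in the Pincus result can be checked directly in a toy case of my own (not from the talk): with F(x) = (x - 2)^2, the distribution P_\lambda is exactly N(2, 1/(2\lambda)), so its mean sits at the minimizer x* = 2 while its spread shrinks as \lambda grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_P_lambda(lam, n=100_000):
    """Draw from P_lam(x) ∝ exp(-lam * (x - 2)^2), i.e. N(2, 1/(2*lam))."""
    return rng.normal(2.0, np.sqrt(1.0 / (2.0 * lam)), size=n)

for lam in (1, 10, 100):
    xs = sample_P_lambda(lam)
    print(lam, xs.mean(), xs.std())   # mean near 2 for all lam; std shrinks with lam
```

For a general (non-Gaussian) F, one would instead sample P_\lambda by MCMC, as on the following slides.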
Markov Chain Simulation
- From x_t to x_{t+1}: sample over \exp(-\lambda F(\mu x_t + (1 - \mu) x_{t+1}))
- Choose directions to obtain mixing in the region
Example
Rosenbrock function: \phi(x = (x_1, x_2)) = (1 - x_1)^2 + 100 (x_2 - x_1^2)^2, which has a global minimum at x^* = (1, 1).
Specification: x_t = u if v \le e^{\lambda(\phi(x_{t-1}) - \phi(u))}, and x_t = x_{t-1} otherwise, where u ~ U([-4, 4]^2), a uniform distribution over [-4, 4] x [-4, 4], and v ~ U([0, 1]), a standard uniform draw.
Note: this corresponds to the Metropolis-Hastings method of Markov chain Monte Carlo using the distribution proportional to e^{-\lambda \phi(x)} for the acceptance criterion.
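The specification above as a runnable sketch; \lambda, the iteration count, and the starting point are illustrative choices, and the min(0, ·) guard only avoids overflow without changing the acceptance rule.

```python
import numpy as np

def rosenbrock(x):
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

def metropolis_rosenbrock(lam=2.0, T=5000, seed=0):
    """Independence Metropolis sampler targeting exp(-lam * phi(x)) on [-4, 4]^2."""
    rng = np.random.default_rng(seed)
    x = np.array([0.0, 0.0])
    best, best_val = x.copy(), rosenbrock(x)
    for _ in range(T):
        u = rng.uniform(-4, 4, size=2)   # uniform proposal over [-4, 4]^2
        v = rng.uniform()                # standard uniform draw
        # accept u when v <= exp(lam * (phi(x_{t-1}) - phi(u)))
        if v <= np.exp(min(0.0, lam * (rosenbrock(x) - rosenbrock(u)))):
            x = u
        if rosenbrock(x) < best_val:
            best, best_val = x.copy(), rosenbrock(x)
    return best, best_val
```

Tracking the best visited point mirrors how such runs are usually scored; larger \lambda concentrates the chain more tightly near x* = (1, 1) but slows mixing along the narrow valley.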
Test for Rosenbrock
- T = 5000 iterations
- N = 500 replications
- Re-sample when: effective sample size < 250
- \lambda = 2, 10, 100
Rosenbrock Graphs (figures): \lambda = 2, 10, 100.
Other Examples (figures): Rastrigin, Himmelblau, and Shubert functions.
Combine Particles and Simulation
- Suppose N particles are used
- Consider a two-stage problem: f(x, \xi) = c(x) + Q(x, y, \xi)
- Distribution on x, y_i, \xi: P(x, y_1, ..., y_N, \xi) \propto \prod_{i=1}^{N} \exp\{-\lambda (c(x) + Q(x, y_i, \xi_i))\}
- Result: E_P(x) \to x^*
Implementation
Propagation forward on the y's:
q(\xi_j | x) \propto \exp\{-\lambda (c(x) + Q(\xi_j, y_j, x))\}\, p(\xi_j), where y_j = argmin_y (c(x) + Q(\xi_j, y, x));
p(x | \xi_1, y_1, ..., \xi_N, y_N) \propto \prod_{j=1}^{N} \exp\{-\lambda (c(x) + Q(\xi_j, y_j, x))\}\, p(\xi_j).
The particles concentrate the distribution on x^* as in the deterministic optimization.
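A toy instance of my own (not the talk's farmer example) shows the concentration: take c(x) = x^2 and quadratic recourse Q(x, \xi) = (x - \xi)^2, so the expected cost is minimized at x* = E[\xi]/2, and run Metropolis on the product target over x with the scenarios \xi_i held fixed.

```python
import numpy as np

def two_stage_mcmc(xi, lam=1.0, T=20000, seed=0):
    """Random-walk Metropolis on P(x) ∝ exp(-lam * sum_i [c(x) + Q(x, xi_i)])
    with c(x) = x^2 and Q(x, xi) = (x - xi)^2 (hypothetical toy costs)."""
    rng = np.random.default_rng(seed)

    def log_target(x):
        # -lam * sum of per-scenario first-stage plus recourse cost
        return -lam * np.sum(x ** 2 + (x - xi) ** 2)

    x = 0.0
    samples = []
    for _ in range(T):
        u = x + rng.normal(0, 0.1)                  # random-walk proposal
        if np.log(rng.uniform()) <= log_target(u) - log_target(x):
            x = u
        samples.append(x)
    return np.array(samples[T // 2:])               # discard burn-in
```

With \xi_i ~ N(4, 1), the product target is a Gaussian with mean near \bar\xi/2 = 2 and standard deviation of order 1/\sqrt{4 \lambda N}, so the post-burn-in sample mean sits essentially at x* = 2, illustrating how the particles sharpen the marginal on x.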
Example (figure): the farmer's problem from Birge/Louveaux, Chapter 1.
Dynamic Models
General idea:
- Create a distribution over decision variables and random variables
- Use MCMC to sample; with a sufficient number of particles or a large enough annealing parameter, the marginal mean converges to the optimum
Extension for dynamics:
- Keep particle numbers constant across stages
- Convergence with a fixed number of particles per stage?
Example: Linear-Quadratic Gaussian
- All distributions are available in closed form
- x_t = A x_{t-1} + B_t u_{t-1} + v_t
- All normal distributions with a risk-sensitive objective: -\log E(\exp(-(\theta/2)(x_2^T Q x_2 + u_1^T R u_1))), where u_1 is the control in the first period
- Note: equivalent to a robust formulation
- Now, all can be found in closed form
Results for LQG
- Mean of u_1: m_{u_1} = L_1(\theta) x_1, where L_1 is the result from the Kalman filter
- Variance of u_1: V_{u_1} = \epsilon(N), where \epsilon(N) \to 0 as N \to \infty
- m_{u_1}(\theta) \to u_1^* as \theta \to 0, where u_1^* is the two-stage risk-neutral solution
Extensions to T Stages
Can derive S_t^\theta recursively to show that:
p(x_t | x_{t-1}, u_{t-1}) = K \exp\Big(-(1/2)\Big(\sum_{s=1}^{t-1}(x_s^T Q_s x_s + u_s^T R_s u_s) + x_t^T S_t^\theta x_t + (x_t - A x_{t-1} - B u_{t-1})^T V^{-1} (x_t - A x_{t-1} - B u_{t-1})\Big)\Big)
For particle methods, each x_t^j, j = 1, ..., N, is drawn consistently from this distribution.
General Method Structure (diagram): particles (w_t^j, x_t^j, u_t^j), j = 1, ..., N, carrying both states and controls, are propagated forward and then re-sampled.
General Result
- If the propagation step is unbiased, then E(u_1) \to u_1^*
- The unbiased result is possible in LQG from the recursive structure
- Harder to obtain in more general systems (without additional future sampling)
Two-Stage Example (figures): comparison to direct Monte Carlo under low and high risk aversion.
Three-Stage Example (figure)
Generalizing Objectives
- Direct: risk-sensitive exponential utility \exp(-\theta Q(x, y, \xi)), with Q quadratic
- Objective e^{-\theta |y - d|}: use E_\lambda[\exp(-(\theta^2/(2\lambda_j))(y_j - d)^2)], where \lambda ~ Exp(2) (a normal scale-mixture representation of the Laplace form)
- Also, use: E[e^{-\theta |y - d|}] \approx 1 - \theta |y - d| + O(\theta^2)
Further Generalizations
- Use objective approximations that preserve normal forms
- Extend LQG-like results to this general framework
- Find other frameworks where unbiased future samples are available
- Obtain sensitivity results on the bias from imperfect future sampling
- Use optimization to pick the maximum-likelihood (ML) point instead
Conclusions
- Techniques from MCMC can be used effectively in optimization to reduce dimensionality effects
- Results work well in situations with sufficient symmetry
- Keys are to effectively use re-sampling, split (and slice) sampling, and latent variables
Other Questions to Consider
- How can learning and parameter estimation be combined with optimization?
- What is the range of problems that can be approached in this manner?
- What are the best uses of optimization technology as part of this process?
Thank you! Questions?