Particle Filtering Approaches for Dynamic Stochastic Optimization

1 Particle Filtering Approaches for Dynamic Stochastic Optimization
John R. Birge, The University of Chicago Booth School of Business
Joint work with Nicholas Polson, Chicago Booth
JRBirge I-Sim Workshop, Purdue U., July

2 Themes
- Dynamic stochastic optimization models face the curse of dimensionality
- Randomization can overcome some dimensionality effects
- With sufficient symmetry, the results can be quite effective
- Keys are conditioning properly on outcomes, appropriate re-sampling and split sampling, and uses of Gaussian mixtures

3 Outline
- Basic problem
- Problems of estimation
- Particle filters and sequential methods for simulation
- Optimization by slice sampling
- Dynamic Gaussian problems
- Extensions to Gaussian mixtures

4 General Problem
Find a set of controls $u_t$, $t = 0, \dots, T$, to
$$\min_{u_t \in U_t \cap A_t,\; x_{t+1} = g_t(x_t, u_t)} \; E\left[\sum_{t=0}^{T} f_t(x_{t+1}, u_t)\right]$$
where $A_t$ requires $u_t$ to be measurable with respect to the information available at time $t$ (nonanticipative), $U_t$ represents the possibility of additional constraints on the controls, $g_t$ represents state dynamics, and $f_t$ represents the cost of the current action and next state transition.
Challenges:
- Rapid expansion of states in the dimension of $x_t$ in each period
- Exponential growth of trees over time

5 First Problem: What is $x_t$?
- Example: hidden Markov model
- Example: we begin with inventory $x_0$ (which may be uncertain)
- We make observations $y_1, \dots, y_T$ based on reported sales and new inputs
- What is the distribution of actual inventory $x_t$ at any time $t$?

6 Bayesian Interpretation
- Suppose a distribution on $x_0$: $f(x_0)$
- Assume some transition: $p(x_t \mid x_{t-1})$
- Observations: $q(y_t \mid x_t)$
- Goal: find $P(x_t \mid y_1, \dots, y_t)$ (or $P(x_t \mid y^T)$, given all observations)
- Idea: use Bayes' rule to find the conditional probability, writing $x^t = (x_0, \dots, x_t)$ and $y^t = (y_1, \dots, y_t)$:
$$P(x^t \mid y^t) = \frac{P(x^t, y^t)}{P(y^t)} \propto \prod_{s=1}^{t} q(y_s \mid x_s)\, p(x_s \mid x_{s-1})\, f(x_0)$$
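When the state takes finitely many values, the Bayes rule above can be applied one period at a time: predict with the transition, multiply by the observation likelihood, and normalize. The following is a minimal sketch under that assumption. The function name `hmm_filter`, the two-state model, and all numbers are hypothetical illustrations, not values from the talk.

```python
def hmm_filter(prior, transition, likelihood, observations):
    """Return P(x_t | y_1..y_t) for a finite-state hidden Markov model.

    prior[i]         : probability that the initial state is i (plays the role of f(x_0))
    transition[i][j] : probability of moving from state i to state j (p)
    likelihood(y, j) : probability of observing y in state j (q)
    """
    belief = list(prior)
    n = len(belief)
    for y in observations:
        # Predict: push the current belief through the one-step transition.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight each state by the likelihood of the new observation.
        unnormalized = [predicted[j] * likelihood(y, j) for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# Hypothetical two-state example: state 1 = "stocked out", state 0 = "in stock".
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
likelihood = lambda y, j: 0.8 if y == j else 0.2  # noisy report of the state
belief = hmm_filter(prior, transition, likelihood, observations=[1])
```

The cost of this exact recursion grows with the number of states, which is the motivation for keeping only N sampled particles instead.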

7 Particle Methods
- Idea: keep a limited number ($N$) of particles in each period
- Propagate forward to obtain new points
- Re-sample (re-weight) backwards to get consistent weights
- Re-balance if weights are low
- References: Gordon, Salmond, Smith (1993), Doucet (2008)

8 General Processes
- If the observation and transition functions, $q$ and $p$, are quite general, then $P(x^t \mid y^t)$ may not be available analytically (e.g., no sales if stocked out, maximum limit, bulky demand)
- Alternatives:
- Markov chain Monte Carlo (MCMC)
- Particle methods

9 Sequential Importance Resampling (SIR) Method
1. Propagate. Draw $x_{t+1,j}$, $j = 1, \dots, N$, from $f(x_{t+1,j} \mid x_{t,j})$.
2. Re-weight. Update weights $\hat{w}_{t,j}$, $j = 1, \dots, N$, to
$$\hat{w}_{t+1,j} = \frac{\hat{w}_{t,j}\, g(y_{t+1} \mid x_{t+1,j})}{\sum_{k=1}^{N} \hat{w}_{t,k}\, g(y_{t+1} \mid x_{t+1,k})}, \quad j = 1, \dots, N.$$
3. Re-sample. If a re-sampling condition is satisfied, draw $N$ new samples $x'_{t+1,j}$, $j = 1, \dots, N$, from $x_{t+1,j}$, $j = 1, \dots, N$, with probability distribution given by $\hat{w}_{t+1,j}$, $j = 1, \dots, N$; let $x_{t+1,j} = x'_{t+1,j}$ and $\hat{w}_{t+1,j} = 1/N$ for $j = 1, \dots, N$.
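The three steps can be sketched for a scalar model. This is an illustration under stated assumptions rather than the talk's implementation: the transition $f$ is taken to be a Gaussian random walk, the observation density $g$ is Gaussian around the state, and the names `sir_step`, `normal_pdf`, `trans_sd`, `obs_sd`, and `resample_condition` are hypothetical.

```python
import math
import random


def normal_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def sir_step(particles, weights, y_next, rng,
             trans_sd=1.0, obs_sd=1.0, resample_condition=lambda w: True):
    """One propagate / re-weight / re-sample step for a random-walk state."""
    n = len(particles)
    # 1. Propagate: draw x_{t+1,j} from the (assumed Gaussian) transition.
    propagated = [x + rng.gauss(0.0, trans_sd) for x in particles]
    # 2. Re-weight: multiply by the observation density and normalize.
    #    (total can underflow to zero if y_next is far from every particle.)
    unnormalized = [w * normal_pdf(y_next, x, obs_sd ** 2)
                    for w, x in zip(weights, propagated)]
    total = sum(unnormalized)
    new_weights = [u / total for u in unnormalized]
    # 3. Re-sample: draw N particles with probabilities given by the weights,
    #    then reset the weights to 1/N.
    if resample_condition(new_weights):
        propagated = rng.choices(propagated, weights=new_weights, k=n)
        new_weights = [1.0 / n] * n
    return propagated, new_weights


rng = random.Random(0)
n = 2000
particles = [rng.gauss(0.0, 1.0) for _ in range(n)]  # x_t ~ N(0, 1)
weights = [1.0 / n] * n
particles, weights = sir_step(particles, weights, y_next=3.0, rng=rng)
posterior_mean = sum(w * x for w, x in zip(weights, particles))
```

For this linear Gaussian example the exact filtered mean is 2 (predictive variance 2, observation variance 1, so the gain is 2/3), which gives a reference value for the particle estimate `posterior_mean`.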

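10 Tractable Case: LQG
- Linear, quadratic, Gaussian (LQG) model
- Implications: all of the distributions are normal, so analytical formulas can be derived (Kalman filter):
$$y_t = H x_t + v_t; \quad x_t = F x_{t-1} + u_t; \quad v_t \sim N(0, R); \quad u_t \sim N(0, Q)$$
- Iterations (update with the new observation, then predict the next period):
$$w_t = y_t - H\hat{x}_t; \quad S = H P H^T + R; \quad K = P H^T S^{-1}; \quad \hat{x}_t \leftarrow \hat{x}_t + K w_t; \quad P \leftarrow (I - KH)P$$
$$\hat{x}_{t+1} = F\hat{x}_t; \quad P \leftarrow F P F^T + Q$$
JRBirge RMS Phoenix, October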
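For a scalar state the Kalman iterations reduce to a few lines of arithmetic. The sketch below assumes scalar $F$, $H$, $Q$, $R$ (so transposes and inverses become ordinary products and divisions); the function name `kalman_step` and the numbers in the example are hypothetical.

```python
def kalman_step(x_hat, P, y, F=1.0, H=1.0, Q=1.0, R=1.0):
    """One scalar Kalman iteration: update with y, then predict one period ahead.

    x_hat, P : predicted mean and variance of the state before seeing y
    Returns (updated mean, updated variance, next predicted mean, next predicted variance).
    """
    w = y - H * x_hat            # innovation
    S = H * P * H + R            # innovation variance
    K = P * H / S                # Kalman gain
    x_upd = x_hat + K * w        # filtered mean
    P_upd = (1.0 - K * H) * P    # filtered variance
    x_pred = F * x_upd           # one-step-ahead mean
    P_pred = F * P_upd * F + Q   # one-step-ahead variance
    return x_upd, P_upd, x_pred, P_pred


# Predicted state N(0, 2), observation y = 3 with unit observation noise.
x_upd, P_upd, x_pred, P_pred = kalman_step(x_hat=0.0, P=2.0, y=3.0)
```

With these inputs the gain is 2/3, so the filtered state is N(2, 2/3) and the next prediction is N(2, 5/3).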

11 Kalman Prediction
[Figure: inventory over 10 periods; actual (*), observed (+)]
- Kalman estimate: $x_t \sim N(\hat{x}_t, P_t)$
- Here: $x_{10} \sim N(13.8, 0.62)$

12 Comparison to MCMC/Particle Filter
- Idea: construct a sample distribution that fits the data using only information about one-step transitions
- Goal: maintain a constant number of samples across time instead of an explosively growing tree
- Can it work? Why?

13 Method Structure
[Diagram: weighted particles $(w_t^j, x_t^j)$, $j = 1, \dots, N$, are propagated to $x_{t+1}^j$; the observation $y_t$ enters at the re-sample step.]

14 Cumulative Results, N=50
[Figure: cumulative particle results compared to Kalman]

15 Particle Frequency, N=50
[Figure: particle (+) and Kalman (*) frequencies]

16 Other Uses of Particles
- Particle filtering: finding $x_t$ from $y^t = (y_1, \dots, y_t)$
- Particle smoothing: finding $x_t$ from $y^T = (y_1, \dots, y_T)$
- Particle learning (Polson et al.): find parameters $\theta_t$ at the same time to explain the model
- Simulation

17 Extensions to Optimization
- Problem: find $x$ to $\min_x f(x)$ subject to $g(x) \le 0$
- Pincus (1968): enough to sample from $\pi_\kappa(x) \propto \exp\{-\kappa f(x)\}$
- With constraints (e.g., Geman/Geman (1985)): $\pi_{\kappa,\lambda}(x) \propto \exp\{-\kappa(f(x) + \lambda g(x))\}$
- Extension: augment with latent variables $(\lambda, \omega)$; then we can write
$$\pi_{\kappa,\lambda}(x) \propto \exp\{-\kappa(f(x) + \lambda g(x))\} = e^{ax}\, E_\omega\{e^{-x^2/(2\omega)}\}\; e^{bx}\, E_\lambda\{e^{-x^2/(2\lambda)}\}$$
where the parameters $(a, b)$ depend on $(\kappa, \lambda)$ and the functional forms of $(f(x), g(x))$.

18 Basic Optimization via Simulation
- Pincus results:
- Model: $\min F(x)$
- Create a distribution $P_\kappa(x) \propto \exp(-\kappa F(x))$
- Then: $\lim_{\kappa \to \infty} E_{P_\kappa}(x) = x^*$
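The limit can be checked numerically on a grid, where the expectation is a finite weighted sum. This is a small sketch with assumptions labeled: `kappa` is the name used here for the annealing parameter, and the objective $F(x) = (x - 1)^2$ on $[-4, 4]$ and the function name `pincus_mean` are hypothetical choices for illustration.

```python
import math


def pincus_mean(F, grid, kappa):
    """Mean of x under weights proportional to exp(-kappa * F(x)) on a grid."""
    f_vals = [F(x) for x in grid]
    f_min = min(f_vals)  # subtracting the minimum avoids underflow for large kappa
    weights = [math.exp(-kappa * (f - f_min)) for f in f_vals]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, grid)) / total


grid = [-4.0 + 8.0 * i / 800 for i in range(801)]  # 801 points on [-4, 4]
F = lambda x: (x - 1.0) ** 2                        # minimizer x* = 1
means = {kappa: pincus_mean(F, grid, kappa) for kappa in (0.1, 1.0, 100.0)}
```

For small `kappa` the distribution is spread over the whole interval and its mean is pulled away from the minimizer; as `kappa` grows, the mass concentrates near $x^* = 1$.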

19 Markov Chain Simulation
- From $x_t$ to $x_{t+1}$
- Sample over $\exp(-\kappa F(\mu x_t + (1 - \mu) x_{t+1}))$
- Choose directions to obtain mixing in the region

20 Example
- Rosenbrock function (standard form): $\phi(x = (x_1, x_2)) = (1 - x_1)^2 + 100\,(x_2 - x_1^2)^2$, which has a global minimum at $x = (1, 1)$.
- Specification for $f$:
$$x_t = \begin{cases} u & \text{if } v \le e^{-\kappa(\phi(u) - \phi(x_{t-1}))}, \\ x_{t-1} & \text{otherwise,} \end{cases}$$
where $u \sim U([-4, 4]^2)$, a uniform distribution over $[-4, 4] \times [-4, 4]$, and $v \sim U([0, 1])$, a standard uniform draw.
- Note: this corresponds to the Metropolis-Hastings method of Markov chain Monte Carlo using the distribution proportional to $e^{-\kappa\phi(x)}$ for the acceptance criterion.
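The acceptance rule can be sketched directly, assuming the standard Rosenbrock coefficients and an independent uniform proposal over the box. The name `kappa` for the annealing parameter, the function names, and the seed are assumptions made for this illustration.

```python
import math
import random


def rosenbrock(x1, x2):
    """Standard Rosenbrock function with global minimum 0 at (1, 1)."""
    return (1.0 - x1) ** 2 + 100.0 * (x2 - x1 ** 2) ** 2


def metropolis_rosenbrock(n_iter, kappa, rng, lo=-4.0, hi=4.0):
    """Metropolis chain targeting a density proportional to exp(-kappa * phi(x))."""
    x = (rng.uniform(lo, hi), rng.uniform(lo, hi))
    fx = rosenbrock(*x)
    best, f_best = x, fx
    for _ in range(n_iter):
        u = (rng.uniform(lo, hi), rng.uniform(lo, hi))  # uniform proposal on the box
        fu = rosenbrock(*u)
        v = rng.random()                                 # standard uniform draw
        # Accept if v <= exp(-kappa * (phi(u) - phi(x))); downhill moves are
        # always accepted, which also keeps the exponent non-positive.
        if fu <= fx or v < math.exp(-kappa * (fu - fx)):
            x, fx = u, fu
            if fx < f_best:
                best, f_best = x, fx
    return x, best, f_best


best_by_kappa = {}
for kappa in (2.0, 10.0, 100.0):
    _, best, f_best = metropolis_rosenbrock(5000, kappa, random.Random(1))
    best_by_kappa[kappa] = (best, f_best)
```

Larger values of `kappa` make uphill moves rarer, so the chain spends more of its time close to the minimum at the cost of slower mixing.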

21 Test for Rosenbrock
- T=5000 iterations
- N=500 replications
- Re-sample when: Effective Sample Size < 250
- $\kappa$ = 2, 10, 100
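The slide does not say how the effective sample size is computed. A common estimate, assumed here, is $1 / \sum_j \hat{w}_j^2$ for normalized weights, which equals $N$ for uniform weights and approaches 1 when a single particle carries nearly all the weight. The function names below are hypothetical; the threshold of 250 is the one from the slide.

```python
def effective_sample_size(weights):
    """Estimate the effective sample size as 1 / sum of squared normalized weights."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def needs_resampling(weights, threshold=250.0):
    """Re-sampling condition: effective sample size below the threshold."""
    return effective_sample_size(weights) < threshold


uniform = [1.0 / 500] * 500                  # all 500 particles equally weighted
degenerate = [0.9] + [0.1 / 499] * 499       # one particle dominates
ess_uniform = effective_sample_size(uniform)
ess_degenerate = effective_sample_size(degenerate)
```

With 500 particles, equal weights give an effective sample size of 500 and no re-sampling, while the dominated weight vector falls far below 250 and triggers it.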

22 Rosenbrock Graphs
[Figures: Rosenbrock results]

23 Other Examples: Rastrigin

24 Himmelblau

25 Shubert

26 Combine Particles and Simulation
- Suppose $N$ particles are used
- Consider a 2-stage problem: $f(x, \xi) = c(x) + Q(x, y, \xi)$
- Distribution on $x$, $y_i$, $\xi$: $P(x, y_1, \dots, y_N, \xi) \propto \prod_{i=1}^{N} \big(c(x) + Q(x, y_i, \xi_i)\big)$
- Result: $E_P(x) \to x^*$ as $N$ grows

27 Implementation
- Propagation forward on the $y$'s:
$$q(\xi_j \mid x) \propto \{c(x) + Q(\xi_j, y_j, x)\}\, p(\xi_j), \quad \text{where } y_j = \arg\min_y \big(c(x) + Q(\xi_j, y, x)\big)$$
$$p(x \mid \xi_1, y_1, \dots, \xi_N, y_N) \propto \prod_{j=1}^{N} \{c(x) + Q(\xi_j, y_j, x)\}\, p(\xi_j)$$
- The particles concentrate the distribution on $x^*$ as in the deterministic optimization.

28 Example
From Birge and Louveaux, Chapter 1 (farmer problem):

29 Dynamic Models
- General idea: create a distribution over decision variables and random variables
- Use MCMC to sample; with a sufficient number of particles or a large enough annealing parameter, the marginal mean concentrates on the optimum
- Extension for dynamics: keep the number of sampled particles constant
- Convergence with a fixed number of particles per stage?

30 Example: Linear-Quadratic Gaussian
- All distributions are available in closed form
- $x_t = A x_{t-1} + B u_{t-1} + v_t$
- All normal distributions with risk-sensitive objective: $-\log E\big(\exp(-(\theta/2)(x_2^T Q x_2 + u_1^T R u_1))\big)$, where $u_1$ is the control in the first period
- Note: equivalent to robust formulation
- Now, all can be found in closed form

31 Results for LQG
- Mean of $u_1$: $m_{u_1} = L_1(\theta)\, x_1$, where $L_1$ is the result from the Kalman filter
- Variance of $u_1$: $V_{u_1}(N)$, where $V_{u_1}(N) \to 0$ as $N \to \infty$
- $m_{u_1}(\theta) \to u_1^*$ as $\theta \to 0$, where $u_1^*$ is the two-stage risk-neutral solution

32 Extensions to T Stages
- Can derive $S_t^\theta$ recursively to show that:
$$p(x_t \mid x_{t-1}, u_{t-1}) = K \exp\Big(-\tfrac{1}{2}\Big(\sum_{s=1}^{t-1} (x_s^T Q_s x_s + u_s^T R_s u_s) + x_t^T S_t^\theta x_t\Big) + (x_t - A x_{t-1} - B u_{t-1})^T V_{t-1} (x_t - A x_{t-1} - B u_{t-1})\Big)$$
- For particle methods, $j = 1, \dots, N$: each $x_t^j$ is drawn consistently from this distribution

33 General Method Structure
[Diagram: weighted particles $(w_t^j, x_t^j)$ with controls $u_t^j$, $j = 1, \dots, N$, are propagated to $x_{t+1}^j$ and then re-sampled.]

34 General Result
- If the propagation step is unbiased, then $E(u_1) \to u_1^*$
- The unbiased result is possible in LQG from the recursive structure
- Harder to obtain in more general systems (without additional future sampling)

35 Two-Stage: Compare Direct MC
[Figure panels: Low Risk Aversion; High Risk Aversion]

36 Three Stage Example

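37 Generalizing Objectives
- Direct: risk-sensitive exponential utility $\exp(-\theta\, Q(x, y, \xi))$, with $Q$ quadratic
- Objective: $e^{-\theta|y - d|}$, represented through the normal scale mixture $E_\lambda\big[\exp\big(-(\theta^2/(2\lambda))\,(y - d)^2\big)\big]$ where $\lambda \sim \mathrm{Exp}(2)$
- Also, for small $\theta$, use: $E[e^{-\theta|y - d|}] \approx 1 - \theta\, E|y - d| + O(\theta^2)$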

38 Further Generalizations
- Use objective approximations that preserve normal forms
- Extend LQG-like results to this general framework
- Find other frameworks where unbiased future samples are available
- Obtain sensitivity results on bias from imperfect future sampling
- Use optimization to pick ML instead

39 Conclusions
- Techniques from MCMC can be used effectively for optimizing to reduce dimension effects
- Results work well in situations with sufficient symmetry
- Keys are to effectively use re-sampling, split (and slice) sampling, and latent variables

40 Other Questions to Consider
- How can learning and parameter estimation be combined with optimization?
- What is the range of problems that can be approached in this manner?
- What are the best uses of optimization technology as part of this process?

41 Thank you! Questions?
