Robust Monte Carlo Methods for Sequential Planning and Decision Making

1 Robust Monte Carlo Methods for Sequential Planning and Decision Making. Sue Zheng, Jason Pacheco, & John Fisher. Sensing, Learning, & Inference Group, Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology. November 30, 2017

2 Diversion Detection [Image: potential diversion points] Given sensor characteristics, observations, and an ideal network, infer deviations (e.g., an unknown network of material diversion).

3 Active Sensing and Decision Making [Image: announced site visit; satellite/flyby EO/SAR/IR] Announce site inspections to learn causal network structure. Modify sensor configurations to reduce uncertainty of inferences.

4 Sequential Experiment Design Experimental choice is a form of planning. [Image: Liepe et al., 2013]

5 Goal Develop an algorithm for sequential Bayesian inference and planning with the following desiderata: an information-theoretic approach to planning; effective even for complex models; provable theoretical guarantees; maximal reuse of computation across the inference and planning phases.

Probabilistic Model & Inference [Image: http://isis-online.

6 Probabilistic Model & Inference [Image: unknown $x$ and observation $y$] Joint probability model: $p(x, y) = p(x)\,p(y \mid x)$, where $p(x)$ is the prior belief in the diversion network and $p(y \mid x)$ is the likelihood of the observations given the structure. Posterior belief in network structure given observations: $p(x \mid y) = \frac{p(x)\,p(y \mid x)}{p(y)}$

7 Decision-Driven Observations [Diagram: unknown $X$, decision $d$, and observation $Y$; each decision $d = 1, 2, \ldots, D$ selects its own observation model $p(y \mid X; d)$] Configuration variable $d \in \{1, \ldots, D\}$ controls the observation model. E.g., announcing inspection of site A affects sensor observations. Choose the configuration that reduces posterior uncertainty.

8 Entropy $H(X) = \mathbb{E}[-\log p(x)]$ Encodes uncertainty; more reliable than variance in many cases (e.g., multimodality). Coin flip example: $X \sim \mathrm{Bernoulli}(p)$
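The coin flip example can be sketched numerically. This is a minimal illustration, assuming natural logarithms (nats) and the usual convention $0 \log 0 = 0$; the function name `bernoulli_entropy` is a placeholder, not something from the slides.

```python
import math


def bernoulli_entropy(p: float) -> float:
    """Entropy in nats of X ~ Bernoulli(p), with the convention 0 log 0 = 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # H(X) = E[-log p(X)] summed over the two outcomes.
    return sum(-q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


# Uncertainty is largest for a fair coin and vanishes for a deterministic one.
for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}  H(X) = {bernoulli_entropy(p):.4f} nats")
```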

9 Mutual Information What would the uncertainty of $X$ be if we knew $Y$? $I(X; Y) = H(X) - H(X \mid Y)$, where $H(X)$ is the prior uncertainty and $H(X \mid Y)$ is the expected posterior uncertainty. I will drop explicit dependence on the configuration variable $d$ when clear from context: $I(X; Y \mid d) \equiv I(X; Y)$
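For a small discrete model the identity $I(X; Y) = H(X) - H(X \mid Y)$ can be evaluated exactly. The sketch below assumes the joint distribution is given as a nested list `joint[x][y]`; that representation and the function names are illustrative choices, not part of the talk.

```python
import math


def entropy(probs):
    """Entropy in nats of a discrete distribution given as a list of probabilities."""
    return sum(-p * math.log(p) for p in probs if p > 0.0)


def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y) for a joint probability table joint[x][y]."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    # Expected posterior uncertainty: H(X|Y) = sum_y p(y) H(X | Y = y).
    h_x_given_y = 0.0
    for y, py in enumerate(p_y):
        if py > 0.0:
            h_x_given_y += py * entropy([row[y] / py for row in joint])
    return entropy(p_x) - h_x_given_y


independent = [[0.25, 0.25], [0.25, 0.25]]   # Y tells us nothing about X
deterministic = [[0.5, 0.0], [0.0, 0.5]]     # Y reveals X exactly
print(mutual_information(independent), mutual_information(deterministic))
```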

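10 Information Theoretic Planning [Diagram: loop of Plan ($d^* = \arg\max_d I(X; Y \mid d)$), Execute ($y \sim p(\cdot \mid x; d)$), and Inference ($p(x \mid y; d)$)] Execute: Draw new observation $Y = y$. Inference: Update posterior belief $p(x \mid y) \propto p(x)\,p(y \mid x)$. Planning: Choose most informative decision $d^* = \arg\max_d I(X; Y \mid d)$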
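The planning step can be illustrated on a toy discrete model where mutual information is computable exactly. In this sketch the two sensor configurations and their likelihood tables are made-up numbers, and `designs[d][x][y]` is assumed to hold $p(y \mid x; d)$; the talk itself targets models where this quantity must be estimated by sampling.

```python
import math


def mutual_information(prior, likelihood):
    """I(X;Y) for prior[x] = p(x) and likelihood[x][y] = p(y | x)."""
    n_y = len(likelihood[0])
    p_y = [sum(prior[x] * likelihood[x][y] for x in range(len(prior)))
           for y in range(n_y)]
    mi = 0.0
    for x, px in enumerate(prior):
        for y in range(n_y):
            pxy = px * likelihood[x][y]
            if pxy > 0.0:
                # log p(x, y) / (p(x) p(y)) = log p(y | x) / p(y)
                mi += pxy * math.log(likelihood[x][y] / p_y[y])
    return mi


# Hypothetical configurations: d = 1 is a noisy sensor, d = 2 is sharper.
designs = {
    1: [[0.6, 0.4], [0.4, 0.6]],
    2: [[0.9, 0.1], [0.2, 0.8]],
}
prior = [0.5, 0.5]

# Planning: d* = argmax_d I(X; Y | d)
best = max(designs, key=lambda d: mutual_information(prior, designs[d]))
print("most informative decision:", best)
```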

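13 Closed-Loop Greedy Planning Unknown quantity: $X$. Observation sequence: $Y_1, Y_2, Y_3, \ldots, Y_T$. Decision sequence: $d_1, d_2, d_3, \ldots, d_T$. Condition on observed information during planning: $d_t^{\text{greedy}} = \arg\max_d I(X; Y_t \mid y_1^{t-1}; d)$. As each $Y_t$ is observed it becomes a realized value $y_t$ that later plans condition on.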
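A closed-loop greedy planner can be simulated on a discrete toy problem: at each step the current belief replaces the prior, the most informative decision is executed, and the realized observation is folded into the belief. This is a sketch under assumed inputs (three hypothetical states, two hypothetical sensor configurations with strictly positive likelihoods); it uses exact mutual information rather than the robust Monte Carlo estimates the talk develops.

```python
import math
import random


def mutual_information(belief, likelihood):
    """I(X;Y) for belief[x] = p(x | past) and likelihood[x][y] = p(y | x; d)."""
    n_y = len(likelihood[0])
    p_y = [sum(b * likelihood[x][y] for x, b in enumerate(belief))
           for y in range(n_y)]
    return sum(b * likelihood[x][y] * math.log(likelihood[x][y] / p_y[y])
               for x, b in enumerate(belief) for y in range(n_y)
               if b * likelihood[x][y] > 0.0)


def bayes_update(belief, likelihood, y):
    """Posterior p(x | y) proportional to p(x) p(y | x)."""
    unnorm = [b * likelihood[x][y] for x, b in enumerate(belief)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def greedy_closed_loop(prior, designs, true_x, horizon, rng):
    belief, history = list(prior), []
    for _ in range(horizon):
        # Plan: condition on everything observed so far via the current belief.
        d = max(designs, key=lambda k: mutual_information(belief, designs[k]))
        # Execute: draw y_t from the observation model of the true state.
        row = designs[d][true_x]
        y = rng.choices(range(len(row)), weights=row)[0]
        # Inference: update the posterior belief.
        belief = bayes_update(belief, designs[d], y)
        history.append((d, y))
    return belief, history


# Hypothetical models: d = 1 separates state 0, d = 2 separates state 2.
designs = {
    1: [[0.9, 0.1], [0.2, 0.8], [0.2, 0.8]],
    2: [[0.8, 0.2], [0.8, 0.2], [0.1, 0.9]],
}
belief, history = greedy_closed_loop([1 / 3] * 3, designs, true_x=2,
                                     horizon=5, rng=random.Random(0))
print(history, [round(b, 3) for b in belief])
```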

18 Estimating Mutual Information Mutual information typically lacks a closed form: $I(X; Y) = \mathbb{E}\left[\log \frac{p(x, y)}{p(x)\,p(y)}\right]$. The evidence integrates over every latent configuration: $p(y) = \int p(x, y)\,dx$. Can use Monte Carlo integration to estimate the integrals from joint samples: $\{x_i, y_i\}_{i=1}^N \sim p(x, y)$. The empirical estimate is sensitive to outliers in the small-sample regime.
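The empirical (non-robust) estimator can be checked on a model where the answer is known. The sketch below assumes a scalar linear-Gaussian model, $X \sim \mathcal{N}(0, 1)$ and $Y = X + \text{noise}$, chosen only because its evidence $p(y)$ and its mutual information $\tfrac{1}{2}\log(1 + 1/\sigma^2)$ are available in closed form; in the setting of the talk the evidence would itself have to be estimated.

```python
import math
import random


def log_normal_pdf(v, mean, var):
    """Log density of N(mean, var) evaluated at v."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (v - mean) ** 2 / var)


def mc_mutual_information(n, noise_var, rng):
    """Empirical mean of theta_i = log p(x_i, y_i) / (p(x_i) p(y_i))."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                        # x_i ~ p(x)
        y = x + rng.gauss(0.0, math.sqrt(noise_var))   # y_i ~ p(y | x_i)
        # The ratio simplifies to log p(y | x) - log p(y); here p(y) = N(0, 1 + noise_var).
        total += (log_normal_pdf(y, x, noise_var)
                  - log_normal_pdf(y, 0.0, 1.0 + noise_var))
    return total / n


def exact_mutual_information(noise_var):
    return 0.5 * math.log(1.0 + 1.0 / noise_var)


rng = random.Random(0)
for n in (10, 100, 10000):
    print(n, mc_mutual_information(n, 0.25, rng), exact_mutual_information(0.25))
```

With small `n` the estimate can swing noticeably from run to run, which is the small-sample sensitivity the slide refers to.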

19 Robust Estimation of Information [Images: absolute error versus $\epsilon$ (confidence level $= 1 - 2\epsilon$), comparing the robust upper bound with the empirical upper/lower bounds; Catoni, 2010] The robust M-estimator $\hat\theta$ is the solution to the root equation $\sum_i \psi(\alpha(\theta_i - \hat\theta)) = 0$, where $\theta_i = \log \frac{p(x_i, y_i)}{p(x_i)\,p(y_i)}$. The influence function $\psi$ reduces the impact of outliers. Provides a quality guarantee in the finite-sample setting.
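The root equation can be solved by bisection, because the sum is monotone in $\hat\theta$ whenever $\psi$ is increasing. The sketch below assumes Catoni's logarithmic influence function, $\psi(u) = \log(1 + u + u^2/2)$ for $u \ge 0$ and $\psi(u) = -\log(1 - u + u^2/2)$ otherwise; the slide does not spell out which $\psi$ or which scale $\alpha$ is used, so both are assumptions here, and `alpha` is left as a user-supplied parameter.

```python
import math


def catoni_psi(u):
    """Catoni's influence function: odd, increasing, logarithmic growth in the tails."""
    if u >= 0.0:
        return math.log1p(u + 0.5 * u * u)
    return -math.log1p(-u + 0.5 * u * u)


def catoni_mean(samples, alpha, tol=1e-10):
    """Solve sum_i psi(alpha * (theta_i - theta_hat)) = 0 for theta_hat by bisection."""
    def score(theta):
        return sum(catoni_psi(alpha * (s - theta)) for s in samples)

    # score is non-increasing in theta, >= 0 at min(samples) and <= 0 at max(samples).
    lo, hi = min(samples), max(samples)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Made-up sample with one gross outlier.
data = [1.0, 1.1, 0.9, 1.05, 0.95, 50.0]
print("empirical mean:", sum(data) / len(data))
print("robust estimate:", catoni_mean(data, alpha=1.0))
```

As `alpha` shrinks, $\psi$ acts almost linearly on the data and the estimate approaches the empirical mean; larger values clip outliers more aggressively.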

20 Integrated Inference & Planning [Diagram: cycle of MCMC Inference, Robust Evidence Estimate, and Robust MI Estimation (Planning)] At time $t$ execute the plan and draw an observation: $y_t \sim p(y \mid x; d_t^{\text{greedy}})$. Do posterior inference via MCMC samples: $\{x_i\}_{i=1}^N \sim p(x \mid y_1^t)$

21 Integrated Inference & Planning [Diagram: cycle of MCMC Inference, Robust Evidence Estimate, and Robust MI Estimation (Planning)] For each decision draw samples: $\{x_i\}_{i=1}^N \sim p(x \mid Y_{t+1}, y_1^t; d)$. Robust estimation of model evidence: $\hat p(Y_{t+1} \mid y_1^t; d)$. Can reuse MCMC samples with importance sampling.
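Sample reuse through importance sampling can be sketched as follows: given draws from the current posterior, the evidence of a candidate observation is the average likelihood over those draws, and the same likelihoods serve as self-normalized weights that approximate the updated posterior without rerunning MCMC. The sketch assumes a scalar Gaussian likelihood and uses draws from $\mathcal{N}(0, 1)$ as a stand-in for MCMC output; it uses the plain empirical mean, not the robust estimator described in the talk.

```python
import math
import random


def normal_pdf(v, mean, var):
    """Density of N(mean, var) evaluated at v."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def evidence_and_weights(samples, y_new, noise_var):
    """Reuse posterior samples x_i to score a candidate observation y_new.

    Returns the Monte Carlo evidence estimate, mean_i p(y_new | x_i), and
    self-normalized importance weights w_i proportional to p(y_new | x_i).
    """
    liks = [normal_pdf(y_new, x, noise_var) for x in samples]
    total = sum(liks)
    return total / len(liks), [lik / total for lik in liks]


rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # stand-in for MCMC draws

evidence, weights = evidence_and_weights(samples, y_new=1.0, noise_var=1.0)
posterior_mean = sum(w * x for w, x in zip(weights, samples))
print("evidence estimate:", evidence)          # closed form here: N(1; 0, 2)
print("reweighted posterior mean:", posterior_mean)  # closed form here: 0.5
```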

22 Integrated Inference & Planning [Diagram: cycle of MCMC Inference, Robust Evidence Estimate, and Robust MI Estimation (Planning)] For each decision compute a robust MI estimate: $\hat I(X; Y_{t+1} \mid y_1^t; d)$. Greedy planning: $d_{t+1}^{\text{greedy}} = \arg\max_d \hat I(X; Y_{t+1} \mid y_1^t; d)$

23 Sequential Inference & Planning [Diagram: MCMC, Robust Evidence Estimate, Robust MI Estimation] Run MCMC only when necessary. Avoid additional samples during planning. Sample reuse significantly reduces computation in the inference and planning stages. Algorithmic details are similar to a particle filter...

24 Asymptotic Results Theoretical bounds on estimators ensure high quality decisions. $\sqrt{N}(\hat H - H) \xrightarrow{d} \mathcal{N}(0, \cdot)$ Establish a central limit theorem as the number of samples $N \to \infty$. Show that the estimators are consistent and approximately Normal with variance $\Theta(1/N)$.

25 Finite Sample Bounds In practice, finite sample bounds are preferred over asymptotic results. With probability $1 - 2\epsilon$: $H(Y) + b - \text{const} < \hat H_Y < H(Y) + b + \text{const}$, where $b = \mathrm{KL}(p(Y) \,\|\, \hat p(Y))$. Estimates are biased, but the deviation is bounded with high probability through use of the robust estimator. [Image: absolute error versus $\epsilon$ (confidence level $= 1 - 2\epsilon$), robust upper bound and empirical upper/lower bounds; Catoni, 2010]

26 Diversions in Nuclear Fuel Cycle [Image] Identify sites for inspection announcement. Performing an intervention to learn causal network structure.

27 Causal Network Inference Nodes interact linearly according to a directed acyclic graph (DAG) structure. [Diagram legend: interaction weight $w$; directed acyclic graph $G$; node observation $X$]

28 Causal Network Inference Graph structure and interaction weights are unknown. [Diagram label: Interventions?]

29 Causal Network Inference [Diagram: a covariance matrix over nodes A, B, C alongside the possible graphs consistent with it] Cannot determine causality from correlations; need to perform active interventions. [Diagram: graphs over A, B, C with one node set to 0] Clamp a node to a fixed value.

30 Causal Network Inference [Image: Cho et al., 2016]

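31 Causal Network Inference Model: $x_j \mid x_{\mathrm{Pa}(j)}, w_j, G \sim \mathcal{N}(w_j x_{\mathrm{Pa}(j)}, \sigma_j^2)$, $w_j \mid G \sim \mathcal{N}(\cdot)$, $G \sim \text{Uniform-DAG}$. [Diagram: graph $G$ and parameters $w$ generate the observation] Planning: previous $t-1$ observations $\mathcal{X} = \{x(1), \ldots, x(t-1)\}$. Select the intervention that maximizes mutual information: $d^* = \arg\max_d I(G; X \mid \mathcal{X}, d)$. Clamp node $x_d = 0$ and observe the remaining nodes. [Diagram: decision sequence $d_1, d_2, d_3, \ldots, d_T$]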
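The observation model under an intervention can be sketched by ancestral sampling: visit nodes in topological order, draw each from a Gaussian centered on the weighted sum of its parents, and overwrite the clamped node with 0. The three-node chain, its weights, and the shared noise level below are hypothetical; only the forward simulation is shown, not inference over $G$ or the mutual information computation.

```python
import random


def sample_linear_gaussian_dag(parents, order, noise_std, rng, clamp=None):
    """Draw one observation from a linear-Gaussian DAG.

    parents[j] maps each parent of node j to its interaction weight.
    order is a topological ordering of the nodes.
    clamp, if given, names the node fixed to 0 by the intervention.
    """
    values = {}
    for node in order:
        if node == clamp:
            values[node] = 0.0  # intervention cuts the node off from its parents
            continue
        mean = sum(w * values[p] for p, w in parents[node].items())
        values[node] = rng.gauss(mean, noise_std)
    return values


# Hypothetical chain A -> B -> C.
parents = {"A": {}, "B": {"A": 0.8}, "C": {"B": -1.5}}
order = ["A", "B", "C"]
rng = random.Random(0)

print(sample_linear_gaussian_dag(parents, order, 1.0, rng))
print(sample_linear_gaussian_dag(parents, order, 1.0, rng, clamp="B"))
```

Clamping B removes the dependence of C on A in this chain, which is the kind of asymmetry that lets interventions distinguish graphs with the same correlations.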

32 Causal Network Inference Robust planning selects most informative interventions in early iterations.

33 Causal Network Inference [Figure panels: MSE, area under PRC, area under ROC; median (solid) and best/worst (dashed) out of 50 runs] More informative experiments than Random. In early iterations, chooses more informative experiments compared to Cho et al., 2016.

34 Summary Sequential Bayesian inference and planning applies to complex models (more interesting applications...). Theoretical guarantees on estimator quality for costly decisions. Scale up to larger problems and bound the probability of incorrect selection.

36 Measurement Selection Sensor Types: Satellite, Flyby, Earth-based. Sensing Modes: EO/SAR/IR, Hyperspectral, Radio Freq.

37 Plan Execution Costs $d^* = \arg\max_d \left[ I(X; Y \mid d) - \lambda r(d) \right]$, an information/cost tradeoff between the information reward $I(X; Y \mid d)$ and the cost $r(d)$. A plan is feasible if the information justifies the cost: $I(X; Y \mid d) > \lambda r(d)$. Possible costs for this application: sensor costs, power consumption.
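The cost-penalized selection rule is simple to state in code. In this sketch the information values, costs, and sensor names are invented for illustration, and the mutual information per decision is assumed to have been estimated already.

```python
def select_plan(information, cost, lam):
    """Pick d* = argmax_d I(X;Y|d) - lam * r(d) among feasible decisions.

    A decision is feasible when I(X;Y|d) > lam * r(d).
    Returns None when no decision justifies its cost.
    """
    scores = {d: information[d] - lam * cost[d]
              for d in information
              if information[d] > lam * cost[d]}
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical information rewards (nats) and costs (arbitrary units).
information = {"satellite": 0.30, "flyby": 0.55, "site_visit": 0.90}
cost = {"satellite": 1.0, "flyby": 3.0, "site_visit": 10.0}

for lam in (0.0, 0.1, 1.0):
    print(f"lambda = {lam}: {select_plan(information, cost, lam)}")
```

Sweeping $\lambda$ shows the tradeoff: with no penalty the most informative option wins, a moderate penalty favors a cheaper sensor, and a large penalty leaves no feasible plan.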

38 Monte Carlo Integration Need to compute expected values of the form $\mathbb{E}[f(x)] = \int p(x) f(x)\,dx$. Draw samples from the distribution: $\{x_i\}_{i=1}^N \sim p(x)$. Monte Carlo integration is the empirical mean: $\mathbb{E}[f(x)] \approx \frac{1}{N} \sum_{i=1}^N f(x_i)$. Good statistical properties as $N \to \infty$, but problematic for finite samples.
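A minimal sketch of the empirical-mean estimator, assuming a standard normal $p(x)$ and the test function $f(x) = x^2$ (whose exact expectation is 1); both choices are illustrative only.

```python
import random


def mc_expectation(f, sampler, n):
    """Estimate E[f(x)] by the empirical mean of f over n draws from sampler()."""
    return sum(f(sampler()) for _ in range(n)) / n


rng = random.Random(0)

def draw():
    return rng.gauss(0.0, 1.0)  # x_i ~ p(x) = N(0, 1)

def square(x):
    return x * x

# The estimate settles near 1 as N grows but can be far off for small N.
for n in (10, 100, 100000):
    print(n, mc_expectation(square, draw, n))
```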

39 Asymptotic and Finite Sample Bounds Asymptotic Bounds: [ ]) d σ 2 N Ĥ Y N (H(Y ), E x (p(y X)) y Mp 2 + σ 2 (log p(y )) (Y ) d ( N Ĥ Y X N H(Y X), σ 2 (log p(y X)) ) Finite-Sample Bounds: Assuming N > 2(1 + log ɛ 1 ) H(Y ) + b c < ĤY < H(Y ) + b + c w.p. 1 2ɛ [ b = E log p(y ) ] 1 + log ɛ 1 σ 2 c = ˆp(Y ; X) 1 (1 + log ɛ 1 )/N 2N

40 Causal Network Inference The robust estimator is more consistent in intervention selection. The average realized gain at $t = 1$ indicates that intervention at node 6 is optimal.
