Controller Synthesis with UPPAAL-TIGA
Alexandre David, Kim G. Larsen, Didier Lime, Franck Cassez, Jean-François Raskin
Overview
- Timed Games
- Algorithm (CONCUR 05)
- Strategies
- Code generation
- Architecture of UPPAAL-TIGA
- Timed Games with Partial Observability
- Algorithm (ATVA 07)
18-11-2010 AFSEC
Controller Synthesis / TGA
Given system moves S, controller moves C, and a property φ, find a strategy S_c such that S_c(S) ⊨ φ, or prove that no such strategy exists.
Timed Game Automata
Introduced by Maler, Pnueli, Sifakis [Maler & al. 95]. The controller continuously observes the system (all delays & moves are observable). The controller can wait (delay action), take a controllable move, or prevent delay by taking a controllable move.
Timed Game Automata
Timed automata with controllable and uncontrollable transitions. Reachability & safety games:
control: A<> TGA.goal
control: A[] not TGA.L4
Memoryless strategy: state → action.
TGA: Let's Play!
control: A<> TGA.goal
(Figure: game play with moves x<1 : λ, x==1 : c, x<2 : λ, x≥2 : c.)
Strategy: x≥1 : c; x<1 : λ; x==1 : c.
Note: this is one strategy; there are other solutions.
Results
[Maler & al. 95, De Alfaro & al. 01] There is a symbolic iterative algorithm to compute the set W* of winning states for timed games. [Henzinger & Kopke 99] Safety and reachability control are EXPTIME-complete. [Cassez & al. 05] Efficient on-the-fly algorithm for safety & reachability games.
Algorithm
On-the-fly forward algorithm with a backward fixpoint computation of the winning/losing sets. Uses all the features of UPPAAL in the forward exploration. Possible to mix forward & backward exploration. Solved by Liu & Smolka in 1998 for untimed games; extended symbolic version at CONCUR 05.
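As a toy illustration of the forward-exploration + back-propagation idea (my own simplified, untimed sketch; `otf_reach`, the edge encoding, and the winning condition are assumptions, not TIGA code):

```python
# Untimed reachability game: explore the graph and propagate winning
# information backwards as soon as goal states are found.
from collections import defaultdict

def _states(edges, init):
    out = {init}
    for s, _, t in edges:
        out |= {s, t}
    return out

def otf_reach(init, goal, edges):
    # edges: iterable of (source, controllable?, target)
    csucc, usucc, preds = defaultdict(list), defaultdict(list), defaultdict(set)
    for s, ctrl, t in edges:
        (csucc if ctrl else usucc)[s].append(t)
        preds[t].add(s)

    win = {s: s in goal for s in _states(edges, init)}
    waiting = list(win)
    while waiting:
        s = waiting.pop()
        # a state wins iff it is a goal, or some controllable successor
        # wins and every uncontrollable successor wins
        new = s in goal or (any(win[t] for t in csucc[s])
                            and all(win[t] for t in usucc[s]))
        if new and not win[s]:
            win[s] = True
            waiting.extend(preds[s])   # re-evaluate predecessors
    return win[init]
```

The real symbolic algorithm additionally handles dense time via pred_t, and is truly on-the-fly (it stops as soon as the initial state is decided).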
Backward Propagation
(Figure: winning states back-propagated over locations L0..L3.) Back-propagate when the goal is reached. Note: this is not a strategy, it is only the set of winning states.
Backward Propagation
(Figure: pred_t, the predecessors of G avoiding B?)
pred_t: From Federation to Zone
For federations G = ∪_i G_i and B = ∪_j B_j:
pred_t(G, B) = ∪_i ∩_j pred_t(G_i, B_j)
where, at the zone level,
pred_t(G_i, B_j) = (G_i↓ \ B_j↓) ∪ ((G_i ∩ B_j↓) \ B_j)↓
with ↓ denoting the down-closure (past) by delay.
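The zone-level formula can be sanity-checked on a one-clock discretization (a sketch of mine; integer points stand in for zones, `down` is the past operator, and all names are invented):

```python
# pred_t(G, B) = {x | exists t>=0: x+t in G and
#                     forall t' in [0,t]: x+t' not in B}
def down(s, lim):
    # past (down-closure by delay) of a set of clock values
    return {x for x in range(lim) if any(y in s for y in range(x, lim))}

def pred_t_semantic(g, b, lim):
    # brute-force check of the definition above
    out = set()
    for x in range(lim):
        for t in range(lim - x):
            if (x + t) in g and not any(p in b for p in range(x, x + t + 1)):
                out.add(x)
                break
    return out

def pred_t_formula(g, b, lim):
    # (g_down \ b_down) | down((g & b_down) \ b)
    gd, bd = down(g, lim), down(b, lim)
    return (gd - bd) | down((g & bd) - b, lim)

# the two computations agree, including when B overlaps G
assert pred_t_formula(set(range(5, 7)), set(range(2, 4)), 20) == \
       pred_t_semantic(set(range(5, 7)), set(range(2, 4)), 20)
```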
Query Language (1)
Reachability properties: control: A[ p U q ], control: A<> q (= control: A[ true U q ]).
Safety properties: control: A[ p W q ], control: A[] p (= control: A[ p W false ]).
Tuning: change the search ordering; back-propagate winning states (BFS+DFS); back-propagate losing states (BFS-BFS); add back-propagation of winning+losing states.
Query Language (2)
Time-optimality: control_t*(u,g): A[ p U q ]. u is an upper bound used to prune the search; it acts like an invariant on the path and is an expression on the current state. g is (in fact a lower bound on) the time to the goal from the current state, also used to prune the search. States with t+g > u are pruned.
Cooperative strategies: E<> control: φ. The property is satisfied iff φ is reachable, and the obtained strategy is maximal.
Cooperative Strategies
The state-space is partitioned between states from which there is a strategy and those from which there is none. A cooperative strategy suggests moves from the opponent that would help the controller. Used in testing. Maximal partition with winning strategies.
Strategies?
The algorithm computes sets of winning and losing states, not strategies. Strategies are computed on top: take actions that lead to winning states (reachability); take actions that avoid losing states (safety); partition states with actions to guarantee progress. This is done on-the-fly, and the obtained strategy depends on the exploration order.
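A minimal sketch of such an extraction (my own illustration with invented names, untimed and considering only controllable edges): for each winning state, pick an action that strictly decreases the distance to the goal, which rules out loops and guarantees progress.

```python
# Extract a memoryless reachability strategy from a winning set by a
# backward BFS from the goal over winning states only.
from collections import deque

def extract_strategy(states, ctrl_edges, goal, winning):
    # ctrl_edges: dict state -> list of (action, successor)
    dist = {s: 0 for s in goal}
    strat = {}
    queue = deque(goal)
    while queue:
        t = queue.popleft()
        for s in states:
            if s in dist or s not in winning:
                continue
            for act, succ in ctrl_edges.get(s, []):
                if succ == t:
                    dist[s] = dist[t] + 1   # strictly closer to goal
                    strat[s] = act
                    queue.append(s)
                    break
    return strat
```

Each chosen action moves to a state with a strictly smaller distance, so following the strategy always makes progress towards the goal.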
Winning States & Strategy
(Figure: the set of winning states over locations L0..L3, and one extracted strategy with wait / past / goal moves.)
Strategies as Partitions
Built on-the-fly. Guarantee progress in the strategy: no loop, deterministic strategy. A different problem than computing the set of winning states. Different search orderings can give different strategies with possibly the same set of winning states.
Code Generation
Mapping state → action: # entries = # states. Decision graph state → action: # tests = # variables, more compact. Based on a hybrid BDD/CDD with multi-terminals.
Decision Graph
BDD: boolean variables. CDD: constraints on clocks. Multi-terminals: actions. This works because we have a partition.
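As a toy picture of such a multi-terminal decision graph (a hypothetical sketch, not the CDD data structure used by TIGA): internal nodes test one boolean variable or one clock bound, terminals are actions, and because the strategy partitions the state space every state reaches exactly one terminal.

```python
# A tiny multi-terminal decision graph mapping states to actions.
from dataclasses import dataclass
from typing import Union

@dataclass
class Test:
    var: str                     # variable or clock to test
    split: float                 # branch on state[var] < split
    low: "Union[Test, str]"      # subtree when the test holds
    high: "Union[Test, str]"     # subtree otherwise; str leaves = actions

def lookup(node, state):
    while isinstance(node, Test):
        node = node.low if state[node.var] < node.split else node.high
    return node                  # the terminal action

# strategy: if x < 1 delay; else if b is set take c1, otherwise c2
strat = Test("x", 1, "delay", Test("b", 1, "c2", "c1"))
assert lookup(strat, {"x": 0.5, "b": 1}) == "delay"
assert lookup(strat, {"x": 2, "b": 1}) == "c1"
```

One lookup costs at most one test per variable, instead of one table entry per state.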
Graph Reduction
Testing consecutive bits: replace by one test with a mask. Can span several variables.
Decision Graph
(Figure: example decision graph.)
Pipeline Architecture
Pipeline components: Source, Sink, Data Buffer, State Filter, Successor.
Pipeline Architecture
(Figure: forward pipeline Source → Transition → Successor → Delay → Extrapolation → state-graph with inclusion check + add; backward pipeline Destination → Predecessor → pred_t, updating win/lose information through the waiting queue.)
Query Language (4)
Partially observable systems: { obs1, obs2 } control: <reachability/safety property>. Generates a controller based on partial observations of the system (obs1, obs2). Games with Büchi accepting states: control: A[] ( p && A<> q ), control: A[] A<> q. (With Franck Cassez, J.-F. Raskin, Didier Lime.)
Query Language (5)
Simulation checking of TA & TGA: { A, B } <= { C, D }. A built-in new algorithm that bypasses the manual encoding we had [LSV tech. report, KGL, AD, TC]. Strategies & simulation available in the GUI. FORMATS 09. (With Thomas Chatain, Peter Bulychev.)
TGA with Büchi Accepting States
Use TIGA's algorithm as an intermediate step in a fixpoint, but modified: winning states are states that can reach goals; goal states are defined by queries, not necessarily winning.
Fixpoint:
  Win = SOTF(goal)
  while (Win ∩ goal ≠ goal)
    goal = Win ∩ goal
    Win = SOTF(goal)
  done
  return s_0 ∈ Win
(Didier Lime.)
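The fixpoint can be mimicked on an untimed graph where every move is controllable (my sketch: `sotf` is a plain backward-reachability stand-in for TIGA's solver, requiring at least one transition before the goal is reached again):

```python
def sotf(goal, edges, states):
    # states that can reach the goal set in at least one step
    win = set()
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in win and any(t in goal or t in win
                                    for t in edges.get(s, [])):
                win.add(s)
                changed = True
    return win

def buchi_win(goal, edges, states):
    # shrink the goal set to the goal states that can be revisited,
    # until it stabilizes (the fixpoint of the slide)
    goal = set(goal)
    win = sotf(goal, edges, states)
    while goal - win:
        goal &= win
        win = sotf(goal, edges, states)
    return win
```

On the cycle 0 → 1 → 0 the goal state 1 can be revisited forever; on a dead-end path it cannot, so the winning set is empty.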
Application: Non-Zeno Strategies
Add a monitor in parallel with the rest of the system and ask control: A[] (p && A<> Monitor.Check). (Figure: monitor automaton looping through location Check with x==1, x:=0 and an urgent transition.)
Timed Games with Partial Observability
Previously: perfect information, which is not always suitable for controllers. Partial observation: of states or events; here, states. Distinguish states w.r.t. observations. The strategy keeps track of states w.r.t. observations. Observations = predicates over states.
Results
Discrete event systems [Kupferman & Vardi 99, Reif 84, Arnold & al. 03], game given as a modal logic formula: full observation is as hard as partial observation. [Chatterjee & al. 06, De Wulf & al. 06], game given as an explicit graph: full observation PTIME, partial observation EXPTIME. Timed systems, game given as a TA [Cassez & al. 07]: efficient on-the-fly algorithm, EXPTIME.
State-Based Full Observation [Franck Cassez, ATVA 07]
2-player reachability game, controllable + uncontrollable actions. Full observation: in l2 do c1, in l3 do c2.
State-Based Partial Observation
Partition the state-space: l2 = l3 are indistinguishable. Can't win here.
State-Based Partial Observation
(Figure.)
Observation for Timed Systems
(Figure.)
Stuttering-Free Invariant Observations
(Figure.)
TGA with PO: Rules
If player 1 wants to take an action c, then player 2 can play any of his actions, or c, as long as the observation stays the same; or player 2 can delay as long as c is not enabled and the observation stays the same. c is urgent. If player 1 wants to delay, then player 2 can delay or take any of his actions as long as the observation stays the same. The turn goes back to player 1 as soon as the observation changes.
On-the-Fly Algorithm
Algorithm
Partition the state-space w.r.t. the observations O1, O2, O3. Winning/losing is observable. (Figure: a run tracked as sets of states per observation; the knowledge set is split on each change of observation.)
Algorithm with Reachability Objective
(Figure.)
Initialization
Passed list: { sets of states W }. Waiting list: { tuples (W, α, W') }. Win[W] maps to 1 (winning) or 0 (unknown); similarly Losing[W]. Depend[W] records the graph. Sink_α: states made sink by doing α.
Sets of Symbolic States & Successors
Next_α(W): the sets of symbolic successors of W for some action α, partitioned by observation (O1, O2, O3). W is a set of symbolic states within one observation. Next_α: do {α asap and wait for α} until a change of observation.
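On an untimed graph this successor computation can be sketched as follows (illustrative names, not TIGA's implementation): from a knowledge set W, playing α yields one successor knowledge set per observation.

```python
def next_obs(W, alpha, edges, obs):
    # edges: dict (state, action) -> set of successor states
    # obs:   dict state -> observation label
    succ = set()
    for s in W:
        succ |= edges.get((s, alpha), set())
    out = {}
    for t in succ:                       # split by observation:
        out.setdefault(obs[t], set()).add(t)
    return out                           # observation -> knowledge set
```

The controller cannot tell apart states within one observation, so each returned set is what it knows after seeing that observation.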
Forward Phase
Update Win, Losing, and the graph. Continue forward. Back-propagate if necessary. Detect deadlocks. Partition the successors.
Backward Phase
If there is a c whose successors are all winning, then W is winning: back-propagate. If every c has a losing successor, then W is losing: back-propagate. Update the graph in case of new paths to passed states.
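The two propagation rules can be written down directly (a sketch with invented names; deadlocked knowledge sets are left out for brevity):

```python
def update_status(W, succs, win, lose):
    # succs: controllable action c -> list of successor knowledge sets
    # rule 1: some c leads only to winning successors  => W is winning
    if any(all(win.get(S, False) for S in Ss) for Ss in succs.values()):
        win[W] = True
    # rule 2: every c has at least one losing successor => W is losing
    elif succs and all(any(lose.get(S, False) for S in Ss)
                       for Ss in succs.values()):
        lose[W] = True
```

Whenever a set's status becomes known, the same rules are re-applied to its predecessors recorded in Depend[W].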
Algorithm
The initial state is in some partition. Compute the successors ({ set of states }) w.r.t. a controllable action. Successors are distinguished by observations.
Algorithm
Construct the graph of sets of symbolic states. Back-propagate winning/losing states.
Notes
Stuttering-free invariant observations → sinks are possible within observations (deadlock/livelock/loop). Actions are urgent → delay until actions are enabled (or the observation changes).
Algorithm
Forward exploration constrained by action + observations; delay is special. Back-propagation: if all successors under a are winning, declare the current state winning with strategy "take action a"; if one successor under a is losing, avoid action a; if no action is winning, the current state is losing.
Example
Observations: L, H, E, B, y in [0,1[.
Example: Memoryful Strategy
(Figure: generated strategy graph.) Partition: y, L∧y, H∧y, L, H, E∧y, E. Actions: delay, y:=0, eject!. Controllable with y ∈ [0,½[, not with y ∈ [0,1[.
Case-Studies
[FORMATS 07] Guided Controller Synthesis Using UPPAAL-TIGA. Jan Jakob Jessen, Jacob Illum Rasmussen, Kim G. Larsen, Alexandre David.
[HSCC 09] Automatic Synthesis of Robust and Optimal Controllers: An Industrial Case Study. Franck Cassez, J. J. Jessen, Kim G. Larsen, Jean-François Raskin, Pierre-Alain Reynier. (Hydac)
Generalizing this Work
The case-study used an ad-hoc translation, an ad-hoc script, and an ad-hoc interface. What we did: defined the interface (inputs & outputs in Simulink); rewrote the Ruby translator (generic inputs/outputs, generic strategies, discretized time); integrated the workflow inside Simulink.
Workflow Between TIGA & Simulink
Results
From a TIGA model and a Simulink model we can generate the S-function that acts as the discrete controller inside Simulink, simulate the model, and validate the controller.