Controller Synthesis with UPPAAL-TIGA
Alexandre David, Kim G. Larsen, Didier Lime, Franck Cassez, Jean-François Raskin
Overview
- Timed Games
- Algorithm (CONCUR 05)
- Strategies
- Code generation
- Architecture of UPPAAL-TIGA
- Timed Games with Partial Observability
- Algorithm (ATVA 07)
18-11-2010 AFSEC
Controller Synthesis / TGA
Given system moves S, controller moves C, and a property φ, find a strategy S_c such that S_c(S) ⊨ φ, or prove that no such strategy exists.
Timed Game Automata
Introduced by Maler, Pnueli, Sifakis [Maler & al. 95]. The controller continuously observes the system (all delays & moves are observable). The controller can wait (delay action), take a controllable move, or prevent delay by taking a controllable move.
Timed Game Automata
Timed automata with controllable and uncontrollable transitions. Reachability & safety games:
control: A<> TGA.goal
control: A[] not TGA.L4
Memoryless strategy: state → action.
TGA: Let's Play!
control: A<> TGA.goal
(Figure: game play with moves x<1 : λ, x==1 : c, x<2 : λ, x≥2 : c.)
Strategy: x≥1 : c; x<1 : λ; x==1 : c.
Note: this is one strategy; there are other solutions.
Results
[Maler & al. 95, De Alfaro & al. 01] There is a symbolic iterative algorithm to compute the set W* of winning states for timed games. [Henzinger & Kopke 99] Safety and reachability control are EXPTIME-complete. [Cassez & al. 05] Efficient on-the-fly algorithm for safety & reachability games.
Algorithm
On-the-fly forward algorithm with a backward fixpoint computation of the winning/losing sets. Uses all the features of UPPAAL in the forward exploration. Possible to mix forward & backward exploration. Solved by Liu & Smolka in 1998 for untimed games; extended symbolic version at CONCUR 05.
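As a toy illustration of the forward-exploration + back-propagation idea (my own simplified, untimed sketch; `otf_reach`, the edge encoding, and the winning condition are assumptions, not TIGA code):

```python
# Untimed reachability game: explore the graph and propagate winning
# information backwards as soon as goal states are found.
from collections import defaultdict

def _states(edges, init):
    out = {init}
    for s, _, t in edges:
        out |= {s, t}
    return out

def otf_reach(init, goal, edges):
    # edges: iterable of (source, controllable?, target)
    csucc, usucc, preds = defaultdict(list), defaultdict(list), defaultdict(set)
    for s, ctrl, t in edges:
        (csucc if ctrl else usucc)[s].append(t)
        preds[t].add(s)

    win = {s: s in goal for s in _states(edges, init)}
    waiting = list(win)
    while waiting:
        s = waiting.pop()
        # a state wins iff it is a goal, or some controllable successor
        # wins and every uncontrollable successor wins
        new = s in goal or (any(win[t] for t in csucc[s])
                            and all(win[t] for t in usucc[s]))
        if new and not win[s]:
            win[s] = True
            waiting.extend(preds[s])   # re-evaluate predecessors
    return win[init]
```

The real symbolic algorithm additionally handles dense time via pred_t, and is truly on-the-fly (it stops as soon as the initial state is decided).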
Backward Propagation
(Figure: winning states back-propagated over locations L0..L3.) Back-propagate when the goal is reached. Note: this is not a strategy, it is only the set of winning states.
Backward Propagation
(Figure: pred_t, the predecessors of G avoiding B?)
pred_t: From Federation to Zone
For federations G = ∪_i G_i and B = ∪_j B_j:
pred_t(G, B) = ∪_i ∩_j pred_t(G_i, B_j)
where, at the zone level,
pred_t(G_i, B_j) = (G_i↓ \ B_j↓) ∪ ((G_i ∩ B_j↓) \ B_j)↓
with ↓ denoting the down-closure (past) by delay.
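The zone-level formula can be sanity-checked on a one-clock discretization (a sketch of mine; integer points stand in for zones, `down` is the past operator, and all names are invented):

```python
# pred_t(G, B) = {x | exists t>=0: x+t in G and
#                     forall t' in [0,t]: x+t' not in B}
def down(s, lim):
    # past (down-closure by delay) of a set of clock values
    return {x for x in range(lim) if any(y in s for y in range(x, lim))}

def pred_t_semantic(g, b, lim):
    # brute-force check of the definition above
    out = set()
    for x in range(lim):
        for t in range(lim - x):
            if (x + t) in g and not any(p in b for p in range(x, x + t + 1)):
                out.add(x)
                break
    return out

def pred_t_formula(g, b, lim):
    # (g_down \ b_down) | down((g & b_down) \ b)
    gd, bd = down(g, lim), down(b, lim)
    return (gd - bd) | down((g & bd) - b, lim)

# the two computations agree, including when B overlaps G
assert pred_t_formula(set(range(5, 7)), set(range(2, 4)), 20) == \
       pred_t_semantic(set(range(5, 7)), set(range(2, 4)), 20)
```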
Query Language (1)
Reachability properties: control: A[ p U q ], control: A<> q (= control: A[ true U q ]).
Safety properties: control: A[ p W q ], control: A[] p (= control: A[ p W false ]).
Tuning: change the search ordering; back-propagate winning states (BFS+DFS); back-propagate losing states (BFS-BFS); add back-propagation of winning+losing states.
Query Language (2)
Time-optimality: control_t*(u,g): A[ p U q ]. u is an upper bound used to prune the search; it acts like an invariant on the path and is an expression on the current state. g is (in fact a lower bound on) the time to the goal from the current state, also used to prune the search. States with t+g > u are pruned.
Cooperative strategies: E<> control: φ. The property is satisfied iff φ is reachable, and the obtained strategy is maximal.
Cooperative Strategies
The state-space is partitioned between states from which there is a strategy and those from which there is none. A cooperative strategy suggests moves from the opponent that would help the controller. Used in testing. Maximal partition with winning strategies.
Strategies?
The algorithm computes sets of winning and losing states, not strategies. Strategies are computed on top: take actions that lead to winning states (reachability); take actions that avoid losing states (safety); partition states with actions to guarantee progress. This is done on-the-fly, and the obtained strategy depends on the exploration order.
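A minimal sketch of such an extraction (my own illustration with invented names, untimed and considering only controllable edges): for each winning state, pick an action that strictly decreases the distance to the goal, which rules out loops and guarantees progress.

```python
# Extract a memoryless reachability strategy from a winning set by a
# backward BFS from the goal over winning states only.
from collections import deque

def extract_strategy(states, ctrl_edges, goal, winning):
    # ctrl_edges: dict state -> list of (action, successor)
    dist = {s: 0 for s in goal}
    strat = {}
    queue = deque(goal)
    while queue:
        t = queue.popleft()
        for s in states:
            if s in dist or s not in winning:
                continue
            for act, succ in ctrl_edges.get(s, []):
                if succ == t:
                    dist[s] = dist[t] + 1   # strictly closer to goal
                    strat[s] = act
                    queue.append(s)
                    break
    return strat
```

Each chosen action moves to a state with a strictly smaller distance, so following the strategy always makes progress towards the goal.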
Winning States & Strategy
(Figure: the set of winning states over locations L0..L3, and one extracted strategy with wait / past / goal moves.)
Strategies as Partitions
Built on-the-fly. Guarantee progress in the strategy: no loop, deterministic strategy. A different problem than computing the set of winning states. Different search orderings can give different strategies with possibly the same set of winning states.
Code Generation
Mapping state → action: # entries = # states. Decision graph state → action: # tests = # variables, more compact. Based on a hybrid BDD/CDD with multi-terminals.
Decision Graph
BDD: boolean variables. CDD: constraints on clocks. Multi-terminals: actions. This works because we have a partition.
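As a toy picture of such a multi-terminal decision graph (a hypothetical sketch, not the CDD data structure used by TIGA): internal nodes test one boolean variable or one clock bound, terminals are actions, and because the strategy partitions the state space every state reaches exactly one terminal.

```python
# A tiny multi-terminal decision graph mapping states to actions.
from dataclasses import dataclass
from typing import Union

@dataclass
class Test:
    var: str                     # variable or clock to test
    split: float                 # branch on state[var] < split
    low: "Union[Test, str]"      # subtree when the test holds
    high: "Union[Test, str]"     # subtree otherwise; str leaves = actions

def lookup(node, state):
    while isinstance(node, Test):
        node = node.low if state[node.var] < node.split else node.high
    return node                  # the terminal action

# strategy: if x < 1 delay; else if b is set take c1, otherwise c2
strat = Test("x", 1, "delay", Test("b", 1, "c2", "c1"))
assert lookup(strat, {"x": 0.5, "b": 1}) == "delay"
assert lookup(strat, {"x": 2, "b": 1}) == "c1"
```

One lookup costs at most one test per variable, instead of one table entry per state.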
Graph Reduction
Testing consecutive bits: replace by one test with a mask. Can span several variables.
Decision Graph
(Figure: example decision graph.)
Pipeline Architecture
Pipeline components: Source, Sink, Data Buffer, State Filter, Successor.
Pipeline Architecture
(Figure: forward pipeline Source → Transition → Successor → Delay → Extrapolation → state-graph with inclusion check + add; backward pipeline Destination → Predecessor → pred_t, updating win/lose information through the waiting queue.)
Query Language (4)
Partially observable systems: { obs1, obs2 } control: <reachability/safety property>. Generates a controller based on partial observations of the system (obs1, obs2). Games with Büchi accepting states: control: A[] ( p && A<> q ), control: A[] A<> q. (With Franck Cassez, J.-F. Raskin, Didier Lime.)
Query Language (5)
Simulation checking of TA & TGA: { A, B } <= { C, D }. A built-in new algorithm that bypasses the manual encoding we had [LSV tech. report, KGL, AD, TC]. Strategies & simulation available in the GUI. FORMATS 09. (With Thomas Chatain, Peter Bulychev.)
TGA with Büchi Accepting States
Use TIGA's algorithm as an intermediate step in a fixpoint, but modified: winning states are states that can reach goals; goal states are defined by queries, not necessarily winning.
Fixpoint:
  Win = SOTF(goal)
  while (Win ∩ goal ≠ goal)
    goal = Win ∩ goal
    Win = SOTF(goal)
  done
  return s_0 ∈ Win
(Didier Lime.)
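The fixpoint can be mimicked on an untimed graph where every move is controllable (my sketch: `sotf` is a plain backward-reachability stand-in for TIGA's solver, requiring at least one transition before the goal is reached again):

```python
def sotf(goal, edges, states):
    # states that can reach the goal set in at least one step
    win = set()
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in win and any(t in goal or t in win
                                    for t in edges.get(s, [])):
                win.add(s)
                changed = True
    return win

def buchi_win(goal, edges, states):
    # shrink the goal set to the goal states that can be revisited,
    # until it stabilizes (the fixpoint of the slide)
    goal = set(goal)
    win = sotf(goal, edges, states)
    while goal - win:
        goal &= win
        win = sotf(goal, edges, states)
    return win
```

On the cycle 0 → 1 → 0 the goal state 1 can be revisited forever; on a dead-end path it cannot, so the winning set is empty.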
Application: Non-Zeno Strategies
Add a monitor in parallel with the rest of the system and ask control: A[] (p && A<> Monitor.Check). (Figure: monitor automaton looping through location Check with x==1, x:=0 and an urgent transition.)
Timed Games with Partial Observability
Previously: perfect information, which is not always suitable for controllers. Partial observation: of states or events; here, states. Distinguish states w.r.t. observations. The strategy keeps track of states w.r.t. observations. Observations = predicates over states.
Results
Discrete event systems [Kupferman & Vardi 99, Reif 84, Arnold & al. 03], game given as a modal logic formula: full observation is as hard as partial observation. [Chatterjee & al. 06, De Wulf & al. 06], game given as an explicit graph: full observation PTIME, partial observation EXPTIME. Timed systems, game given as a TA [Cassez & al. 07]: efficient on-the-fly algorithm, EXPTIME.
State-Based Full Observation [Franck Cassez, ATVA 07]
2-player reachability game, controllable + uncontrollable actions. Full observation: in l2 do c1, in l3 do c2.
State-Based Partial Observation
Partition the state-space: l2 = l3 are indistinguishable. Can't win here.
State-Based Partial Observation
(Figure.)
Observation for Timed Systems
(Figure.)
Stuttering-Free Invariant Observations
(Figure.)
TGA with PO: Rules
If player 1 wants to take an action c, then player 2 can play any of his actions, or c, as long as the observation stays the same; or player 2 can delay as long as c is not enabled and the observation stays the same. c is urgent. If player 1 wants to delay, then player 2 can delay or take any of his actions as long as the observation stays the same. The turn goes back to player 1 as soon as the observation changes.
On-the-Fly Algorithm
Algorithm
Partition the state-space w.r.t. the observations O1, O2, O3. Winning/losing is observable. (Figure: a run tracked as sets of states per observation; the knowledge set is split on each change of observation.)
Algorithm with Reachability Objective
(Figure.)
Initialization
Passed list: { sets of states W }. Waiting list: { tuples (W, α, W') }. Win[W] maps to 1 (winning) or 0 (unknown); similarly Losing[W]. Depend[W] records the graph. Sink_α: states made sink by doing α.
Sets of Symbolic States & Successors
Next_α(W): the sets of symbolic successors of W for some action α, partitioned by observation (O1, O2, O3). W is a set of symbolic states within one observation. Next_α: do {α asap and wait for α} until a change of observation.
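On an untimed graph this successor computation can be sketched as follows (illustrative names, not TIGA's implementation): from a knowledge set W, playing α yields one successor knowledge set per observation.

```python
def next_obs(W, alpha, edges, obs):
    # edges: dict (state, action) -> set of successor states
    # obs:   dict state -> observation label
    succ = set()
    for s in W:
        succ |= edges.get((s, alpha), set())
    out = {}
    for t in succ:                       # split by observation:
        out.setdefault(obs[t], set()).add(t)
    return out                           # observation -> knowledge set
```

The controller cannot tell apart states within one observation, so each returned set is what it knows after seeing that observation.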
Forward Phase
Update Win, Losing, and the graph. Continue forward. Back-propagate if necessary. Detect deadlocks. Partition the successors.
Backward Phase
If there is a c whose successors are all winning, then W is winning: back-propagate. If every c has a losing successor, then W is losing: back-propagate. Update the graph in case of new paths to passed states.
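The two propagation rules can be written down directly (a sketch with invented names; deadlocked knowledge sets are left out for brevity):

```python
def update_status(W, succs, win, lose):
    # succs: controllable action c -> list of successor knowledge sets
    # rule 1: some c leads only to winning successors  => W is winning
    if any(all(win.get(S, False) for S in Ss) for Ss in succs.values()):
        win[W] = True
    # rule 2: every c has at least one losing successor => W is losing
    elif succs and all(any(lose.get(S, False) for S in Ss)
                       for Ss in succs.values()):
        lose[W] = True
```

Whenever a set's status becomes known, the same rules are re-applied to its predecessors recorded in Depend[W].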
Algorithm
The initial state is in some partition. Compute the successors ({ set of states }) w.r.t. a controllable action. Successors are distinguished by observations.
Algorithm
Construct the graph of sets of symbolic states. Back-propagate winning/losing states.
Notes
Stuttering-free invariant observations → sinks are possible within observations (deadlock/livelock/loop). Actions are urgent → delay until actions are enabled (or the observation changes).
Algorithm
Forward exploration constrained by action + observations; delay is special. Back-propagation: if all successors under a are winning, declare the current state winning with strategy "take action a"; if one successor under a is losing, avoid action a; if no action is winning, the current state is losing.
Example
Observations: L, H, E, B, y in [0,1[.
Example: Memoryful Strategy
(Figure: generated strategy graph.) Partition: y, L∧y, H∧y, L, H, E∧y, E. Actions: delay, y:=0, eject!. Controllable with y ∈ [0,½[, not with y ∈ [0,1[.
Case-Studies
[FORMATS 07] Guided Controller Synthesis Using UPPAAL-TIGA. Jan Jakob Jessen, Jacob Illum Rasmussen, Kim G. Larsen, Alexandre David.
[HSCC 09] Automatic Synthesis of Robust and Optimal Controllers: An Industrial Case Study. Franck Cassez, J. J. Jessen, Kim G. Larsen, Jean-François Raskin, Pierre-Alain Reynier. (Hydac)
Generalizing this Work
The case-study used an ad-hoc translation, an ad-hoc script, and an ad-hoc interface. What we did: defined the interface (inputs & outputs in Simulink); rewrote the Ruby translator (generic inputs/outputs, generic strategies, discretized time); integrated the workflow inside Simulink.
Workflow Between TIGA & Simulink
Results
From a TIGA model and a Simulink model we can generate the S-function that acts as the discrete controller inside Simulink, simulate the model, and validate the controller.