Lecture 8
Receding Horizon Temporal Logic Planning & Finite-State Abstraction
Ufuk Topcu, Nok Wongpiromsarn, Richard M. Murray
AFRL, 26 April 2012

Contents of the lecture:
- Intro: Incorporating continuous dynamics & sources of computational complexity
- Recall: Receding horizon control
- Receding horizon temporal logic planning (RHTLP)
  - Basic idea & main result
  - Discussion of the key details of implementation
  - Autonomous driving examples
- Finite-state abstraction & hierarchical control architecture
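Problem: Design control protocols that...
- Handle a mixture of discrete and continuous dynamics
- Account for both high-level specs and low-level constraints
- Reactively respond to changes in the environment,
... with correctness certificates.

[Figure: Traffic Planner (high-level specification) above Path Planner (low-level dynamics and constraints).]
High-level specification: $(\varphi_{init} \wedge \varphi_{env}) \rightarrow (\varphi_{safety} \wedge \varphi_{goal})$
Low-level dynamics and constraints: $\dot{x} = f(x, u, \delta)$, $g(x, u) \le 0$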
Preview
Alice's navigation stack, seen from different views:
- Hierarchical control architecture: Mission Planner, Traffic Planner, Path Planner, Vehicle Actuation
- Mission Planner: long-horizon specification
- Traffic Planner: short-horizon specification
- Path Planner: continuous dynamics & constraints,
  $$\min \int_{t_0}^{T} L(x, u)\,dt \quad \text{s.t.} \quad \dot{x} = f(x, u), \quad g(x, u) \le 0$$
- Multi-scale models: cells $W_0, \dots, W_{L-1}, W_L$
TuLiP: Temporal logic planning toolbox (Open source at http://tulip-control.sf.net) [Coming up in the next lecture]
This lecture focuses on two of the remaining issues:
- Incorporating continuous dynamics
- Computational complexity
Computational Complexity
[Figure: road of length $L$ divided into cells; endpoints labeled Eiffel Tower and Supélec.]
- Each of these cells may be occupied by an obstacle.
- The vehicle can be in any of these cells.
- $(2L)(2^{2L})$ possible states!
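To get a feel for how quickly this count grows, here is a minimal sketch. It assumes the road has 2L cells, each either free or occupied, with the vehicle in any one of them; the function name `num_states` is hypothetical.

```python
def num_states(road_length: int) -> int:
    """Number of discrete states for a road with 2*road_length cells.

    Assumption: the vehicle occupies one of the 2L cells and every cell
    is independently free or occupied by an obstacle.
    """
    cells = 2 * road_length
    vehicle_positions = cells            # vehicle can be in any cell
    obstacle_patterns = 2 ** cells       # each cell free or occupied
    return vehicle_positions * obstacle_patterns


if __name__ == "__main__":
    for length in (1, 5, 10, 50):
        print(length, num_states(length))
```

The exponential factor dominates, which is what motivates solving smaller subproblems rather than the global problem.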
Receding Horizon Control
$$\min_{u_{[t,t+T]}} \int_t^{t+T} C(x(\tau), u(\tau))\,d\tau + V(x(t+T))$$
subject to:
$$\dot{x} = f(x, u), \quad x(t) \text{ given}, \quad x(t+T) = x_f, \quad g(x, u) \le 0$$
- Reduces the computational cost by solving smaller problems.
- Real-time (re)computation improves robustness.
Receding Horizon Control
$$\min_{u_{[t,t+T]}} \underbrace{\int_t^{t+T} C(x(\tau), u(\tau))\,d\tau}_{\text{finite-horizon optimization}} + \underbrace{V(x(t+T))}_{\text{terminal cost}}$$
subject to:
$$\dot{x} = f(x, u), \quad x(t) \text{ given}, \quad x(t+T) = x_f, \quad g(x, u) \le 0$$
- If not implemented properly, global properties, e.g., stability, are not guaranteed.
- Increasing $T$ helps for stability at the expense of increased computational cost.
- If the terminal cost is chosen as a control Lyapunov function, i.e., $V$ is (locally) positive definite and satisfies (for some $r > 0$)
  $$\min_u (\dot{V} + C)(x, u) < 0, \quad \forall x \in \{x : V(x) \le r^2\},$$
  then stability is guaranteed.
- Alternative (related) approach: impose contractiveness constraints in the short-horizon problems.
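The receding horizon loop can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and not part of the lecture: a discrete-time scalar system x+ = x + u, three admissible inputs, a quadratic stage cost, a quadratic terminal cost, and brute-force enumeration in place of a real optimizer. The terminal constraint and g(x, u) are omitted.

```python
from itertools import product

INPUTS = (-1.0, 0.0, 1.0)  # assumed finite input set


def step(x, u):
    """Assumed toy dynamics: x[t+1] = x[t] + u[t]."""
    return x + u


def stage_cost(x, u):
    return x * x + 0.1 * u * u


def terminal_cost(x):
    return 10.0 * x * x


def solve_short_horizon(x0, horizon):
    """Enumerate all input sequences of the given length; return the cheapest."""
    best_cost, best_seq = float("inf"), None
    for seq in product(INPUTS, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            cost += stage_cost(x, u)
            x = step(x, u)
        cost += terminal_cost(x)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def receding_horizon(x0, horizon, n_steps):
    """Apply only the first input of each short-horizon plan, then re-plan."""
    x, trajectory = x0, [x0]
    for _ in range(n_steps):
        u = solve_short_horizon(x, horizon)[0]
        x = step(x, u)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    print(receding_horizon(5.0, horizon=3, n_steps=8))
```

Each call to `solve_short_horizon` only looks `horizon` steps ahead, so the per-step cost stays fixed no matter how long the system runs; re-planning at every step is what lets the loop react to disturbances.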
Receding Horizon for LTL Synthesis [TAC 11 (submitted), HSCC 10]
Global (long-horizon) specification:
$$(\varphi_{init} \wedge \varphi_{env}) \rightarrow (\varphi_{safety} \wedge \varphi_{goal})$$
Basic idea: Partition the state space into a partially ordered set $(\{W_j\}, \preceq_{\varphi_g})$ (goal-induced partial order) and plan from the current cell on. Get closer to the goal rather than reaching it.
Short-horizon specification: For each $i$,
$$((\nu \in W_i) \wedge \Phi \wedge \varphi_{env}) \rightarrow (\square\Phi \wedge \varphi_{safety} \wedge \square\lozenge(\nu \in F_i(W_i)))$$
- $\Phi$: receding horizon invariant, rules out corner cases
- $F$: determines the horizon length
Theorem: Receding horizon implementation of the short-horizon strategies ensures the correctness of the global specification.
Trade-offs: computational cost vs. horizon length vs. strength of invariant vs. conservatism
[Figure: discrete states $\nu_1, \dots, \nu_{10}$ grouped into cells $W_0, \dots, W_4$ under the goal-induced partial order; $W_0$ contains the states satisfying $\varphi_{goal}$.]
How to come up with a partial order, $F$ and $\Phi$?
In general, this is problem-dependent and requires user guidance. Partial automation is possible (discussed later).
- Partial order: measure of closeness to the goal, i.e., to the states satisfying $\varphi_{goal}$.
- The map $F$ determines the horizon length. In this example:
  $$F(W_j) = W_{j-2}, \; j \ge 2; \qquad F(W_j) = W_0, \; j < 2$$
- The invariant $\Phi$ (in this example) rules out the states that render the short-horizon problems unrealizable. In the example above, it is the conjunction of the following propositional formulas on the initial states for each subproblem:
  - no collision in the initial state
  - vehicle cannot be in the left lane unless there is an obstacle in the right lane in the initial state
  - vehicle is able to progress from the initial state
[Figure: road partitioned into cells $W_L, W_{L-1}, \dots, W_0$.]
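As a small illustration of how the map F fixes the horizon length, the sketch below encodes the example map on cell indices and lists the sequence of target cells visited when planning from W_L toward W_0. The names `horizon_map` and `target_chain` are hypothetical, and representing cells by their integer index is an assumption of this sketch.

```python
def horizon_map(j: int, horizon: int = 2) -> int:
    """Index of F(W_j): W_{j-horizon} if j >= horizon, else W_0."""
    return max(j - horizon, 0)


def target_chain(start: int, horizon: int = 2) -> list:
    """Indices of the cells targeted successively, starting from W_start."""
    chain = [start]
    while chain[-1] != 0:
        chain.append(horizon_map(chain[-1], horizon))
    return chain


if __name__ == "__main__":
    print(target_chain(5))             # successive targets with horizon 2
    print(target_chain(5, horizon=1))  # shorter horizon, more subproblems
```

A longer horizon means fewer but larger subproblems along the chain.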
Navigation of point-mass omnidirectional vehicle
Nondimensionalized dynamics:
$$\ddot{x} + \dot{x} = q_x(t), \qquad \ddot{y} + \dot{y} = q_y(t), \qquad \ddot{\theta} + \frac{2mL^2}{J}\dot{\theta} = q_\theta$$
Conservative bounds on control authority to decouple the dynamics:
$$|q_x(t)|, |q_y(t)| \le 0.5, \qquad |q_\theta(t)| \le 1$$
Reasons for the non-intuitive trajectories:
- Synthesis: feasibility rather than optimality.
- Specifications are not rich enough.
[Figure: road partitioned into cells $W_L, W_{L-1}, \dots, W_0$, with a detail of the partition in two consecutive cells (axes $v$ and $z$).]
Example: Navigation In Urban-Like Environment
Dynamics: $\dot{x}(t) = u_x(t) + d_x(t)$, $\dot{y}(t) = u_y(t) + d_y(t)$
Actuation limits: $u_x(t), u_y(t) \in [-1, 1]$, $\forall t \ge 0$
Disturbances: $d_x(t), d_y(t) \in [-0.1, 0.1]$, $\forall t \ge 0$
Traffic rules:
- No collision
- Stay in right lane unless blocked by obstacle
- Proceed through intersection only when clear
Environment assumptions:
- Obstacle may not block a road
- Obstacle is detected before it gets too close
- Limited sensing range (2 cells ahead)
- Obstacle does not disappear when the vehicle is in its vicinity
- Obstacles don't span more than a certain number of consecutive cells in the middle of the road
- Each intersection is clear infinitely often
- Cells marked by a star and adjacent cells are not occupied by an obstacle infinitely often
Goals: Visit the cells with *'s infinitely often.
Navigation In Urban-Like Environment [TAC 11 (submitted), HSCC 10]
Setup:
- Dynamics: Fully actuated with actuation limits and bounded disturbances
- Specifications:
  - Traffic rules
  - Assumptions on obstacles, sensing range, intersections, ...
  - Goals: Visit the two stars infinitely often
Results:
- Without receding horizon: 1e87 states (hence, not solvable)
- Receding horizon:
  - Partial order: From the top layer of the control hierarchy
  - Horizon length = 2 ($F^i(W^i_j) = W^i_{j-2}$)
  - Invariant: Not surrounded by obstacles. If started in left lane, obstacle in right lane.
  - 1e4 states in the automaton.
  - ~1.5 sec for each short-horizon problem
  - Milliseconds for partial order generation
[Figure: Goal Generator and Trajectory Planner layers with a "response" signal.]
What is $\Phi$?
- A propositional formula (that we call the receding horizon invariant).
- Used to exclude the initial states that render synthesis infeasible, e.g., states from which collision is unavoidable.
Short-horizon specification:
$$((\nu \in W_i) \wedge \Phi \wedge \varphi_{env}) \rightarrow (\square\Phi \wedge \varphi_{safety} \wedge \square\lozenge(\nu \in F_i(W_i)))$$
Given the partial order and $F$, computation of the invariant can be automated:
- Check realizability.
- If realizable, done.
- If not, collect the violating initial conditions. Negate them and put them in $\Phi$.
- Repeat until all subproblems are realizable or all possible states are excluded (in the latter case, either the global problem is infeasible or RHTLP with the given partial order and $F$ is inconclusive).
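The iteration above can be sketched on a toy model. This is not the lecture's synthesis procedure: as a loud simplification, each subproblem here is a finite graph with no environment player, "realizable from a state" means the controller can avoid unsafe and excluded states forever, and Φ is represented by the set of states it excludes. The names `losing_states` and `compute_invariant` are hypothetical.

```python
def losing_states(transitions, unsafe):
    """States from which every path eventually hits `unsafe` (toy check).

    Assumption: no environment player; the controller picks the successor.
    """
    losing = set(unsafe)
    changed = True
    while changed:
        changed = False
        for state, successors in transitions.items():
            if state not in losing and all(s in losing for s in successors):
                losing.add(state)
                changed = True
    return losing


def compute_invariant(subproblems):
    """Return the set of states excluded by the invariant Phi.

    Each subproblem is (transitions, unsafe, initial). Because Phi must
    keep holding, states already excluded are treated as unsafe everywhere.
    """
    excluded = set()
    changed = True
    while changed:                      # repeat until nothing new is excluded
        changed = False
        for transitions, unsafe, initial in subproblems:
            losing = losing_states(transitions, set(unsafe) | excluded)
            violating = (set(initial) - excluded) & losing
            if violating:               # negate violating initial conditions
                excluded |= violating
                changed = True
    return excluded


if __name__ == "__main__":
    sub1 = ({"p": ["q"], "q": ["q"], "r": ["x"], "x": ["x"]}, {"x"}, {"p", "r"})
    sub2 = ({"s": ["r"], "r": ["r"], "t": ["t"]}, set(), {"s", "t"})
    print(sorted(compute_invariant([sub2, sub1])))
```

In the toy data, excluding `r` in the first subproblem makes `s` a violating initial state of the second one, which is why the loop has to repeat rather than make a single pass. If the returned set covers every initial state, the outcome is inconclusive in the sense described above.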
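Generalization to multiple goals
General form of LTL specifications considered in reactive control protocol synthesis:
$$\Big(\psi_{init} \wedge \square\psi_e \wedge \bigwedge_{i \in I_f} \square\lozenge\psi_{f,i}\Big) \rightarrow \Big(\bigwedge_{i \in I_s} \square\psi_{s,i} \wedge \underbrace{\bigwedge_{i \in I_g} \square\lozenge\psi_{g,i}}_{\text{multiple goals}}\Big)$$
One partial order per goal: partial order 1, partial order 2, ..., partial order $n$.
- Each partial order covers the discrete (system) state space.
- For each $\nu \in W^i_{j_0}$, one can find a cell in the next partial order that $\nu$ belongs to.
Strategy: While in $W^i_j$, implement (in a receding horizon fashion) the controller that realizes
$$\Big((\nu \in W^i_j) \wedge \Phi \wedge \square\psi^e_e \wedge \bigwedge_{k \in I_f} \square\lozenge\psi^e_{f,k}\Big) \rightarrow \Big(\bigwedge_{k \in I_s} \square\psi_{s,k} \wedge \square\lozenge(\nu \in F^i(W^i_j)) \wedge \square\Phi\Big)$$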
Computational complexity & completeness
For Generalized Reactivity [1] formulas,
$$\Big(\bigwedge_{i=1}^{m} \square\lozenge p^e_i\Big) \rightarrow \Big(\bigwedge_{j=1}^{n} \square\lozenge q^s_j\Big),$$
the computation time of synthesis is $O(mn|\Sigma|^3)$, where $|\Sigma|$ is the number of discrete states.
Receding horizon implementation...
- reduces the computational complexity by restricting the state space considered in each subproblem; and
- is not complete, i.e., the global problem may be solvable but the choice of $\{W_j\}$, the partial order, the maps $F_i$, and $\Phi$ may not lead to a solution.
Choosing $F_i$ to give a longer horizon:
- Subproblems in RHTLP are more likely to be realizable.
- Computational cost is higher.
- E.g., the urban-like driving example is infeasible with a horizon length of one.
Global synthesis problem:
$$(\varphi_{init} \wedge \varphi_{env}) \rightarrow (\varphi_{safety} \wedge \varphi_{goal})$$
Subproblems in RHTLP:
$$((\nu \in W_i) \wedge \Phi \wedge \varphi_{env}) \rightarrow (\varphi_{safety} \wedge \square\lozenge(\nu \in F_i(W_i)) \wedge \square\Phi)$$
[Figure: road partitioned into cells $W_L, W_{L-1}, \dots, W_0$.]
Hierarchical control structure
Models of varying fidelity, one per layer:
- Goal Generator: cells $W_L, W_{L-1}, \dots, W_0$
- Trajectory Planner: partition of the continuous state space (axes $v$ and $z$), with environment input (env)
- Continuous Controller:
  $$\ddot{x} + \dot{x} = q_x(t), \qquad \ddot{y} + \dot{y} = q_y(t), \qquad \ddot{\theta} + \frac{2mL^2}{J}\dot{\theta} = q_\theta$$
  $$|q_x(t)|, |q_y(t)| \le 0.5, \qquad |q_\theta(t)| \le 1$$
- Plant with Local Control (signals: $u$, $s_d$, noise, $\delta u$)
Abstraction procedure and bisimulations relate models of different fidelity levels. (How: see the coming slides.)
A response mechanism is introduced to compensate for mismatch between the system and its model and between the actual behavior of the environment and its assumptions.
Incorporating continuous dynamics: main idea
[Figure: the Trajectory Planner works on the discrete states $\nu_1, \dots, \nu_{10}$ and receives a "response" signal; the Continuous Controller works on the continuous state space $X$, partitioned into cells labeled $\nu_1, \dots, \nu_{10}$.]
Continuous dynamics: $\dot{\xi} = f(\xi, w, u)$, $\xi \in X$, $u \in U$, $w \in W$
Theorem: For any discrete run satisfying the specification, there exists an admissible control signal leading to a continuous trajectory satisfying the specification.
Proof: Constructive. Finite-state model + continuous control signals.
Abstraction refinement for reducing potential conservatism.
Finite state abstraction
Given: A system with controlled variables $s \in S$ in domain $dom(s)$ and environment variables $e \in E$ in domain $dom(e)$. Define $v = (s, e)$, $V = S \cup E$ and $dom(v) = dom(s) \times dom(e)$.
Controlled variables evolve with (for $t = 0, 1, 2, \dots$):
- $s[t+1] = A s[t] + B_u u[t] + B_d d[t]$ (state evolution)
- $u[t] \in U$ (admissible control inputs)
- $d[t] \in D$ (exogenous disturbances)
- $s[0] \in dom(s)$, $s[t+1] \in dom(s)$ (set that states take values in)
System specification: $\varphi$
Find: A finite transition system with discrete states $\nu$ such that for any sequence $\nu_0 \nu_1 \dots$ satisfying $\varphi$, (very roughly speaking) there exists a sequence of admissible control signals leading to an infinite sequence $v_0 v_1 v_2 \dots$ that satisfies $\varphi$. (Stated more precisely later...)
[Figure: continuous states $v_0, \dots, v_5$ along a trajectory through cells labeled $\nu_0, \dots, \nu_5$.]
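Proposition preserving partition
Given $dom(v)$ and atomic propositions in $\Pi$. A partition of $dom(v)$ is said to be proposition preserving if, for any atomic proposition $\pi \in \Pi$ and any states $v$ and $v'$ that belong to the same cell of the partition, $v$ satisfies $\pi$ if and only if $v'$ satisfies $\pi$.
Example: $\Pi$ consists of linear inequalities in $(x, y)$, comparing $x$ with 1, $y$ with 0, $x + y$ with 0, ...
[Figure: $(x, y)$ plane cut by the corresponding lines into cells labeled $\nu_0, \dots, \nu_9$.]
A discrete state $\nu$ is said to satisfy $\pi$ if and only if there exists a continuous state $v$, in the cell labeled $\nu$, that satisfies $\pi$:
$$\nu_5 \models_d \pi \iff \exists v \in \nu_5 \text{ s.t. } v \models \pi$$
Combined with proposition preservation ($v \models \pi \iff v' \models \pi$ for $v, v'$ in the same cell), every continuous state in the cell $\nu_5$ agrees with $\nu_5$ on $\pi$.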
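The definition can be spot-checked on sample points. The sketch below is only a sampled test, not a proof: it assumes each cell is represented by a finite list of representative points and each atomic proposition by a Python predicate. The two example predicates (x >= 1 and y >= 0) are hypothetical choices, and the function name is likewise an assumption.

```python
def is_proposition_preserving(cells, propositions):
    """Sampled check: within each cell, all points agree on every proposition.

    Assumption: `cells` is a list of lists of sample points; agreement on
    the samples does not prove agreement on the whole cell.
    """
    for cell in cells:
        for prop in propositions:
            truth_values = {prop(point) for point in cell}
            if len(truth_values) > 1:   # two points in the cell disagree
                return False
    return True


# Hypothetical atomic propositions on points (x, y).
PROPS = [
    lambda p: p[0] >= 1,
    lambda p: p[1] >= 0,
]

if __name__ == "__main__":
    good = [[(2, 1), (3, 2)], [(0, 1), (-1, 2)], [(0, -1), (-2, -3)], [(2, -1)]]
    bad = [[(2, 1), (0, 1)]]  # the two points disagree on x >= 1
    print(is_proposition_preserving(good, PROPS))
    print(is_proposition_preserving(bad, PROPS))
```

For polyhedral cells and linear propositions, an exact check would compare each cell against each half-space instead of sampling.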
Finite-time reachability
A discrete state $\nu_j$ is finite-time reachable from a discrete state $\nu_i$ only if, starting from any $s[0] \in T_s^{-1}(\nu_i)$, there exist
- a finite horizon length $N \in \{0, 1, \dots\}$, and
- for any allowable disturbance, $u[0], u[1], \dots, u[N-1] \in U$
such that
$$s[N] \in T_s^{-1}(\nu_j), \qquad s[t] \in T_s^{-1}(\nu_i) \cup T_s^{-1}(\nu_j), \; \forall t \in \{0, \dots, N\}.$$
System dynamics: $s[t+1] = A s[t] + B_u u[t] + B_d d[t]$, $u[t] \in U$, $d[t] \in D$, $s[0] \in dom(s)$, $s[t+1] \in dom(s)$.
Verifying the reachability relation:
- Compute the set $S_0$ of $s[0]$ from which $T_s^{-1}(\nu_j)$ can be reached under the system dynamics in a pre-specified time $N$.
- Check whether $T_s^{-1}(\nu_i) \subseteq S_0$.
[Figure: cells $\nu_0, \dots, \nu_9$ with the set $S_0$ overlaid on $T_s^{-1}(\nu_i)$ and $T_s^{-1}(\nu_j)$.]
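Computing $S_0$
Given $N$ and polyhedral sets
$$T_s^{-1}(\nu_i) = \{s \in \mathbb{R}^n : L_1 s \le M_1\}, \quad U = \{u \in \mathbb{R}^m : L_2 u \le M_2\}, \quad T_s^{-1}(\nu_j) = \{s \in \mathbb{R}^n : L_3 s \le M_3\}.$$
$S_0$ is computed as the set of $s_0$ such that there exist $u[0], \dots, u[N-1]$ satisfying $L_2 u[t] \le M_2$ for $t \in \{0, \dots, N-1\}$, leading to
$$L_1 s[t] \le M_1 \text{ for } t = 0, \dots, N-1, \qquad L_3 s[N] \le M_3,$$
where
$$s[t] = A^t s_0 + \sum_{k=0}^{t-1} \left( A^k B_u u[t-1-k] + A^k B_d d[t-1-k] \right)$$
(affine in $s_0$ and $u$), for all $d[0], \dots, d[N-1] \in D$ ($D$ polyhedral).
Put together: $S_0$ is computed as a polytope projection:
$$S_0 = \left\{ s_0 \in \mathbb{R}^n : \exists \hat{u} \in \mathbb{R}^{mN} \text{ s.t. } L \begin{bmatrix} s_0 \\ \hat{u} \end{bmatrix} \le M - G\hat{d}, \; \forall \hat{d} \in \bar{D}^N \right\}$$
- $\hat{u}$, $\hat{d}$: stacking of $u$ and $d$
- $\bar{D}^N$: set of vertices of $D^N = D \times \dots \times D$
[Figure: cells $\nu_0, \dots, \nu_8$ with $T_s^{-1}(\nu_1)$ and $T_s^{-1}(\nu_0)$ highlighted.]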
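A scalar sketch may help make the two ingredients above concrete: the closed-form expression for s[t], and the fact that the disturbance only needs to be checked at the vertices of D^N because the constraints are affine in the disturbance. Everything here is an illustrative assumption: a one-dimensional system, interval cells, a given open-loop input sequence (so the code tests membership of one s0 for one candidate input sequence, it does not compute the projection), and hypothetical function names.

```python
from itertools import product


def rollout(a, b_u, b_d, s0, u_seq, d_seq):
    """Simulate s[t+1] = a*s[t] + b_u*u[t] + b_d*d[t]; return s[0..N]."""
    s, trajectory = s0, [s0]
    for u, d in zip(u_seq, d_seq):
        s = a * s + b_u * u + b_d * d
        trajectory.append(s)
    return trajectory


def closed_form(a, b_u, b_d, s0, u_seq, d_seq, t):
    """s[t] = a^t s0 + sum_k a^k (b_u u[t-1-k] + b_d d[t-1-k])."""
    return a ** t * s0 + sum(
        a ** k * (b_u * u_seq[t - 1 - k] + b_d * d_seq[t - 1 - k])
        for k in range(t)
    )


def robustly_reaches(a, b_u, b_d, s0, u_seq, cell_i, cell_j, d_bounds):
    """Check one s0 and one input sequence against all vertex disturbances.

    cell_i, cell_j, d_bounds are (low, high) intervals. The state must stay
    in cell_i for t = 0..N-1 and land in cell_j at t = N.
    """
    n = len(u_seq)
    for d_seq in product(d_bounds, repeat=n):   # vertices of D^N
        traj = rollout(a, b_u, b_d, s0, u_seq, d_seq)
        if not all(cell_i[0] <= s <= cell_i[1] for s in traj[:n]):
            return False
        if not cell_j[0] <= traj[n] <= cell_j[1]:
            return False
    return True


if __name__ == "__main__":
    args = dict(a=1.0, b_u=1.0, b_d=1.0, u_seq=(0.3, 0.7),
                cell_i=(0.0, 1.0), cell_j=(1.0, 2.0), d_bounds=(-0.1, 0.1))
    print(robustly_reaches(s0=0.5, **args))
    print(robustly_reaches(s0=0.95, **args))
```

In the matrix case the same vertex enumeration produces the stacked inequality over the vertices of D^N, and the existential quantifier over the inputs is eliminated by projecting the resulting polytope onto s0.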
Refining the partition
While checking the reachability from $T_s^{-1}(\nu_i)$ to $T_s^{-1}(\nu_j)$, if $T_s^{-1}(\nu_i) \not\subseteq S_0$, then
- Split $T_s^{-1}(\nu_i)$ into $T_s^{-1}(\nu_i) \cap S_0$ and $T_s^{-1}(\nu_i) \cap S_0^c$.
- Remove $\nu_i$ from the set of discrete states.
- Add two new discrete states corresponding to $T_s^{-1}(\nu_i) \cap S_0$ and $T_s^{-1}(\nu_i) \cap S_0^c$.
Repeat until no cell can be sub-partitioned such that the volumes of the two resulting new cells are both greater than $Vol_{min}$.
- Smaller $Vol_{min}$ leads to more cells in the partition and more allowable transitions.
- If the initial partition is proposition preserving, so is the resulting partition.
Define the finite transition system $\mathbb{D}$, an abstraction of $\mathbb{S}$, as:
- $\mathcal{V} := \mathcal{S} \times \mathcal{E}$, set of discrete states (both controller and environment).
- $\nu_i = (\varsigma_i, \epsilon_i) \rightarrow \nu_j = (\varsigma_j, \epsilon_j)$ only if $\varsigma_j$ is reachable from $\varsigma_i$.
[Figure: the cell $T_s^{-1}(\nu_0)$ split into $T_s^{-1}(\nu_0) \cap S_0$ and $T_s^{-1}(\nu_0) \cap S_0^c$, giving new discrete states $\nu_{10}$ and $\nu_{11}$.]
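The splitting step can be sketched in one dimension, where cells and S_0 are intervals and "volume" is length. This is an assumption-laden toy: in 1D the complement part of a cell may consist of two pieces (one on each side of S_0), so the function returns a list of pieces. The name `split_cell` is hypothetical.

```python
def split_cell(cell, s0_set, vol_min):
    """Split an interval cell into its part inside S0 and the part(s) outside.

    cell and s0_set are (low, high) intervals. The cell is left unchanged
    if it lies entirely inside or outside S0, or if any resulting piece
    would have length <= vol_min.
    """
    lo, hi = cell
    a, b = max(lo, s0_set[0]), min(hi, s0_set[1])
    if a >= b:                       # no overlap with S0
        return [cell]
    if a <= lo and b >= hi:          # cell is contained in S0
        return [cell]
    pieces = [(a, b)]                # cell intersected with S0
    if lo < a:
        pieces.append((lo, a))       # part of the cell below S0
    if b < hi:
        pieces.append((b, hi))       # part of the cell above S0
    if any(q - p <= vol_min for p, q in pieces):
        return [cell]                # refuse splits that are too fine
    return sorted(pieces)


if __name__ == "__main__":
    print(split_cell((0.0, 4.0), (1.0, 10.0), vol_min=0.5))
    print(split_cell((0.0, 4.0), (3.8, 10.0), vol_min=0.5))
```

Lowering `vol_min` lets the second call go through as well, which mirrors the trade-off between partition size and the number of allowable transitions.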
Correctness of the hierarchical implementation
[Figure: continuous states $v_0, \dots, v_8$ along a trajectory through cells labeled $\nu_0, \nu_1, \nu_2, \dots$]
Using
- the proposition preserving property of the partition,
- the fact that $\mathbb{D}$ only includes the transitions that are implemented by the control signal $u$ within some finite time (by construction through the reachability formulation), and
- stutter invariance of the specification $\varphi$,
... we can prove:
Let $\sigma_d = \nu_0 \nu_1 \dots$ be a sequence in $\mathbb{D}$ with $\nu_k \rightarrow \nu_{k+1}$, $\nu_k = (\varsigma_k, \epsilon_k)$, $\varsigma_k \in \mathcal{S}$ and $\epsilon_k \in \mathcal{E}$. If $\sigma_d \models_d \varphi$, then by applying a sequence of control signals from the Reachability Problem with initial set $T_s^{-1}(\varsigma_k)$ and final set $T_s^{-1}(\varsigma_{k+1})$, the sequence of continuous states $\sigma = v_0 v_1 v_2 \dots$ satisfies $\varphi$.

Stutter invariance:
- Two words $\sigma_1$ and $\sigma_2$ over $2^{AP}$ are stutter equivalent if there exist an infinite sequence $A_0 A_1 A_2 \dots$ of sets of atomic propositions and positive natural numbers $n_0, n_1, n_2, \dots$ and $m_0, m_1, m_2, \dots$ such that $\sigma_1$ and $\sigma_2$ are of the form
  $$\sigma_1 = A_0^{n_0} A_1^{n_1} A_2^{n_2} \dots, \qquad \sigma_2 = A_0^{m_0} A_1^{m_1} A_2^{m_2} \dots$$
- An LT property $P$ is stutter-invariant if for any word $\sigma \in P$ all stutter-equivalent words are also contained in $P$.
- Example: $v_0 v_1 \dots v_8 \dots$ and $\nu_0 \nu_1 \dots$ are stutter-equivalent.
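For finite prefixes, stutter equivalence can be tested by collapsing consecutive repetitions of the same label set and comparing the results. The sketch below is restricted to finite words, which is an assumption of this example since the definition above is stated for infinite words; the function names are hypothetical.

```python
from itertools import groupby


def destutter(word):
    """Collapse each maximal block of equal consecutive labels to one label."""
    return [label for label, _ in groupby(word)]


def stutter_equivalent(word1, word2):
    """Finite-word check: equal after removing stuttering steps."""
    return destutter(word1) == destutter(word2)


if __name__ == "__main__":
    A, B, C = frozenset({"a"}), frozenset({"b"}), frozenset({"a", "c"})
    continuous_labels = [A, A, A, B, B, C]   # labels along v_0 v_1 ... v_5
    discrete_labels = [A, B, C]              # labels along nu_0 nu_1 nu_2
    print(stutter_equivalent(continuous_labels, discrete_labels))
    print(stutter_equivalent([A, B, A], [A, B]))
```

This mirrors the slide's example: the continuous trajectory lingers in each cell for several steps, while the discrete run visits each cell once.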
Summary
Alice's navigation stack, seen from different views:
- Hierarchical control architecture: Mission Planner, Traffic Planner, Path Planner, Vehicle Actuation
- Mission Planner: long-horizon specification
- Traffic Planner: short-horizon specification
- Path Planner: continuous dynamics & constraints,
  $$\min \int_{t_0}^{T} L(x, u)\,dt \quad \text{s.t.} \quad \dot{x} = f(x, u), \quad g(x, u) \le 0$$
- Multi-scale models: cells $W_0, \dots, W_{L-1}, W_L$
TuLiP: Temporal logic planning toolbox (Open source at http://tulip-control.sf.net) [Coming up in the next lecture]