Multi-Pursuer Single-Evader Differential Games with Limited Observations*


2013 American Control Conference (ACC), Washington, DC, USA, June 17-19, 2013

Multi-Pursuer Single-Evader Differential Games with Limited Observations*

Wei Lin, Student Member, IEEE, Zhihua Qu, Fellow, IEEE, and Marwan A. Simaan, Life Fellow, IEEE

Abstract— In this paper, closed-loop Nash equilibrium strategies for an N-pursuer single-evader differential game over a finite time horizon with limited observations are considered. The game setting is such that each pursuer has a limited sensing range and can observe the state vector of another player only if that player is within the pursuer's sensing range. The evader, on the other hand, has unlimited sensing range, which allows it to observe the states of all pursuers at all times and implement a standard closed-loop Nash strategy. To derive strategies for the pursuers, a new concept of best achievable performance indices is proposed. These indices are derived so as to be the closest to the original performance indices while ensuring that the resulting collective pursuer strategy satisfies a Nash equilibrium against the evader's strategy. The strategies obtained by this approach are independent of the initial state vector. An illustrative example is solved, and simulation results corresponding to different sensing ranges and performance indices of the game are presented.

I. INTRODUCTION

The pursuit-evasion game has been considered for decades as a model for illustrating a variety of applications of differential game theory [1], [2]. Basically, it models a process in which one or more pursuers try to chase one or more evaders while the evaders try to escape. Solving a pursuit-evasion game essentially involves developing strategies for the pursuers and evaders such that their prescribed performance indices are optimized. Historically, pursuit-evasion games were first investigated by Isaacs [1] in the 1950s. Saddle-point solutions for a type of zero-sum single-pursuer single-evader game were considered in [2].
Nonzero-sum pursuit-evasion games were introduced and investigated as an example of Nash equilibrium strategies in [3] and as an example of leader-follower Stackelberg strategies in [4]. Games with two pursuers and one evader were studied in [5]. Recently, there has been considerable interest in problems that involve many pursuers chasing one evader [6]-[11]. These types of games present some interesting challenges in designing coordinated strategies among all the pursuers to accomplish the goal of capturing the evader. From a practical point of view, however, it is important to consider games with state information constraints, where global information is not available to all players. In this paper, we consider an N-pursuer single-evader game over a finite time horizon where only the evader has global information, with the ability to observe all the pursuers. The pursuers, on the other hand, have limited sensing capabilities. Each pursuer is only able to observe players that fall within its sensing range during any interval of time in the game. This necessitates that each pursuer implement a strategy based only on the information available to it. Most of the results on pursuit-evasion games so far are built upon the assumption that all players have global or full information about all the players in the game in order to implement their closed-loop Nash strategies. Some of the recent papers on this subject have considered limited information structures, either for the pursuers or for the evader [6], [9], [10]. In this paper, we consider a game between a single evader with a high sensing capability, outnumbered by several pursuers with poor sensing capabilities.

*This work is supported in part by a grant (CCF-96) from the National Science Foundation. W. Lin, Z. Qu, and M. A. Simaan are with the Department of EECS, University of Central Florida, Orlando, FL 32816, USA. Emails: weilin99@gmail.com, qu@eecs.ucf.edu, simaan@eecs.ucf.edu.
A practical example of such a situation occurs when a well-equipped drone with a very wide sensing range must evade several weakly-equipped pursuing air vehicles. These types of problems need to be treated using an approach different from what currently exists in the literature. This paper builds on previous results obtained in [12] and further explores how inverse optimality can be used to secure a Nash equilibrium between the pursuers and the evader.

The remainder of the paper is organized as follows. The problem formulation is presented in Section II. The evader's Nash equilibrium strategy, based on a global information structure, is derived in Section III. The Nash strategies for the pursuers with limited observations are analyzed and derived in Section IV. An illustrative example and simulation results are shown in Section V. The paper is concluded in Section VI.

II. PROBLEM FORMULATION

The differential game considered in this paper is between N pursuers and a single evader in unbounded Euclidean space. We assume that the state of each player is governed by the differential equation

  ẋ_i = F x_i + G_i u_i,    (1)

for i = e, 1, ..., N, where x_i ∈ R^n is the state variable, u_i ∈ R^{m_i} is the control input, and matrices F and G_i are of proper dimensions. Subscript e stands for the evader and subscripts 1 to N stand for pursuers 1 to N. We assume that all players have the same dynamics (i.e., the same F matrix), which allows for an explicit derivation of the dynamics of a vector z = [z_1^T, ..., z_N^T]^T, z_i = x_i - x_e, whose entries represent the differences (or displacements) between each pursuer's state vector and the evader's.

Hence, the overall dynamics become

  ż = A z + B_e u_e + Σ_{i=1}^{N} B_i u_i = A z + B_e u_e + B_p u_p,    (2)

where u_p = [u_1^T, ..., u_N^T]^T, A = I_N ⊗ F (I_N is the N × N identity matrix and ⊗ is the Kronecker product), B_e = -1_N ⊗ G_e (1_N is an N-vector with all elements equal to 1), B_j = d_j ⊗ G_j (d_j is an N-vector with the jth element equal to 1 and all other elements equal to 0), B_p = blkdiag{G_1, ..., G_N}, and blkdiag stands for block-diagonal matrix.

We assume that the collective objective of the pursuers is to minimize a quadratic function of the vector z while at the same time minimizing their energy costs. Hence, denoting ||x_i||²_M = x_i^T M x_i, we assume that all the pursuers try to minimize a common performance index

  J_p = (1/2)||z(T)||²_{M_p} + (1/2) ∫_0^T ( ||z||²_{Q_p} + ||u_p||²_{R_p} ) dt,    (3)

where M_p = blkdiag{M_{p1}, ..., M_{pN}}, Q_p = blkdiag{Q_{p1}, ..., Q_{pN}}, and R_p = blkdiag{R_{p1}, ..., R_{pN}}. Matrices M_{pj} ∈ R^{n×n} and Q_{pj} ∈ R^{n×n} are positive semidefinite and matrix R_{pj} ∈ R^{m_j×m_j} is positive definite for j = 1, ..., N. Similarly, we assume that the objective of the evader is to maximize a quadratic function of the vector z while at the same time keeping its energy costs at a minimum. Hence, the evader will try to minimize the performance index

  J_e = (1/2)||z(T)||²_{M_e} + (1/2) ∫_0^T ( ||z||²_{Q_e} + ||u_e||²_{R_e} ) dt,    (4)

where M_e = blkdiag{M_{e1}, ..., M_{eN}} and Q_e = blkdiag{Q_{e1}, ..., Q_{eN}}. Matrices M_{ej} ∈ R^{n×n} and Q_{ej} ∈ R^{n×n} are negative semidefinite for j = 1, ..., N, and matrix R_e ∈ R^{m_e×m_e} is positive definite. Given the system dynamics (2) and performance indices (3)-(4), a nonzero-sum differential game is formed. The Nash equilibrium for this game is defined as follows.

Definition 1: For the N-pursuer single-evader differential game described in (2), with the pursuers minimizing performance index (3) and the evader minimizing performance index (4), the strategy set {u_p*, u_e*} is a Nash equilibrium if the inequalities

  J_p(u_p*, u_e*) ≤ J_p(u_p, u_e*),    (5)
  J_e(u_p*, u_e*) ≤ J_e(u_p*, u_e)    (6)

hold for any u_p ∈ U_p and u_e ∈ U_e, where U_p and U_e are the admissible strategy sets for the pursuers and evader, respectively.
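The Kronecker-product construction in (2) can be sketched numerically. The following is an illustrative assembly (the particular F, G_i, and dimensions below are assumptions for demonstration, not the paper's data):

```python
import numpy as np

# Sketch: build the stacked matrices of Eq. (2) with Kronecker products:
# A = I_N (x) F, B_e = -1_N (x) G_e, B_j = d_j (x) G_j, B_p block-diagonal.

def blkdiag(blocks):
    """Hand-rolled block-diagonal assembly (avoids a SciPy dependency)."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def stacked_dynamics(F, G_list, G_e):
    """Return A, B_e, B_p, and the individual B_j of Eq. (2)."""
    N = len(G_list)
    A = np.kron(np.eye(N), F)               # same drift F for every player
    B_e = np.kron(-np.ones((N, 1)), G_e)    # z_i = x_i - x_e gives -G_e u_e
    B_js = []
    for j in range(N):
        d_j = np.zeros((N, 1))
        d_j[j] = 1.0                        # jth standard basis vector
        B_js.append(np.kron(d_j, G_list[j]))
    B_p = blkdiag(G_list)                   # so that B_p u_p = sum_j B_j u_j
    return A, B_e, B_p, B_js
```

With these shapes, B_p u_p and the sum of the individual B_j u_j coincide, which is exactly the identity used in (2).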
III. NASH STRATEGIES FOR THE EVADER

From the perspective of the evader, since it has a wide enough sensing range to observe all the pursuers, it will naturally adopt its closed-loop Nash strategy derived from (2), (3), and (4), without knowing that the pursuers only have limited observations. Therefore, according to the well-known Nash equilibrium result on linear quadratic differential games from [3], the evader's strategy is

  u_e* = -R_e^{-1} B_e^T P_e z,    (7)

where P_e is obtained by solving the coupled differential Riccati equations

  Ṗ_e + Q_e + P_e A + A^T P_e - P_e B_p R_p^{-1} B_p^T P_p - P_p B_p R_p^{-1} B_p^T P_e - P_e B_e R_e^{-1} B_e^T P_e = 0,    (8a)
  Ṗ_p + Q_p + P_p A + A^T P_p - P_p B_p R_p^{-1} B_p^T P_p - P_p B_e R_e^{-1} B_e^T P_e - P_e B_e R_e^{-1} B_e^T P_p = 0,    (8b)

with boundary conditions P_p(T) = M_p and P_e(T) = M_e.

IV. NASH STRATEGIES FOR THE PURSUERS

Since the evader will implement its Nash strategy (7), clearly the best strategy for the pursuers is to implement the corresponding Nash strategy

  u_p* = -R_p^{-1} B_p^T P_p z,    (9)

where P_p is obtained from the solution of (8). However, the pursuers are generally not able to implement (9) because the ith element of u_p* in (9), i.e., the Nash strategy for pursuer i, is in general a full state feedback of z and hence requires pursuer i to have global information. Therefore, under the constraint of limited observations, a different strategy for the pursuers must be implemented. This strategy must form a new Nash equilibrium together with the evader's strategy (7). To accurately model the sensing capabilities of the pursuers, we define a sensing radius r_i for pursuer i. This radius determines whether the full state vector of another player in the game is observable to that pursuer. Consequently, we define a time-dependent Laplacian matrix [13] for the N pursuers as follows:
  L(t) = [ L_11(t) ... L_1N(t) ; ... ; L_N1(t) ... L_NN(t) ],    (10)

where, for j ≠ i,

  L_ij(t) = -1 if ||Λz_i(t) - Λz_j(t)|| ≤ r_i,
  L_ij(t) = 0  if ||Λz_i(t) - Λz_j(t)|| > r_i,

and L_ii = -Σ_{l=1, l≠i}^{N} L_il. Also define a vector h(t) = [h_1(t), ..., h_N(t)]^T whose ith entry h_i(t) represents the ability of the ith pursuer to observe the evader at time t. That is,

  h_i(t) = 1 if ||Λz_i(t)|| ≤ r_i,
  h_i(t) = 0 if ||Λz_i(t)|| > r_i.    (11)
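The sensing-based Laplacian (10) and observation vector (11) are straightforward to compute from the relative states. A minimal sketch (with made-up positions), assuming Λ extracts the position coordinates and using the convention that an observed neighbor contributes -1 off the diagonal:

```python
import numpy as np

# Sketch: build L(t) of Eq. (10) and h(t) of Eq. (11) from the
# pursuer-evader offsets z_i.  Lam selects the position coordinates
# out of each z_i; r[i] is pursuer i's sensing radius.

def sensing_topology(z_list, Lam, r):
    N = len(z_list)
    pos = [Lam @ z for z in z_list]          # positions relative to evader
    L = np.zeros((N, N))
    h = np.zeros(N)
    for i in range(N):
        for j in range(N):
            if j != i and np.linalg.norm(pos[i] - pos[j]) <= r[i]:
                L[i, j] = -1.0               # pursuer i observes pursuer j
        L[i, i] = -L[i].sum()                # L_ii = -sum_{l != i} L_il
        h[i] = 1.0 if np.linalg.norm(pos[i]) <= r[i] else 0.0
    return L, h
```

Note that L(t) need not be symmetric, since the sensing radii r_i may differ between pursuers.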

In (10) and (11), Λz_i(t) represents the position coordinates of pursuer i relative to the position of the evader, extracted from the vector z_i(t) at time t. Further, we assume that if (1/N) Σ_{i=1}^{N} ||Λz_i|| ≤ ε, where ε is a capture radius, then the game terminates and the evader is assumed to be captured. We assume that the strategy for pursuer i is chosen in the following form:

  u_i = K_i1 h_i z_i + K_i2 Σ_{j=1}^{N} L_ij (z_j - z_i).    (12)

The first term in this expression represents a control component that chases the evader directly whenever pursuer i observes the evader. The second term represents a feedback control of the differences between pursuer i's state and those of its neighboring pursuers. In general, pursuer i's control could be chosen with a more general structure than (12); in this paper we choose u_i as in (12) because of its simplicity and applicability in realistic situations. Control u_i can be rewritten as

  u_i = K_i1 [(h_i d_i^T) ⊗ I_n] z + K_i2 [(d_i^T L) ⊗ I_n] z
      = [K_i1  K_i2] ([h_i d_i^T ; d_i^T L] ⊗ I_n) z
      ≜ K_i C_i z,    (13)

where K_i = [K_i1  K_i2] ∈ R^{m_i × 2n} and

  C_i = [h_i d_i^T ; d_i^T L] ⊗ I_n ∈ R^{2n × nN}.    (14)

Therefore, denoting

  u_p = K_s z,    (15)
  K_s = [(K_1 C_1)^T, ..., (K_N C_N)^T]^T,    (16)

the problem reduces to finding matrices {K_1, ..., K_N} such that u_p can still form a Nash equilibrium with the evader's strategy u_e* in (7). However, according to Definition 1, given performance indices (3) and (4), if the evader sticks to the Nash strategy u_e* in (7) and the pursuers fail to implement the corresponding Nash strategy (9), the Nash equilibrium no longer holds. Since it is generally not possible for the two expressions for u_p in (9) and (15) to coincide, an intuitive approach is to modify the performance indices of both the evader and the pursuers, keeping them as close as possible to the original indices given in (3) and (4), but in such a way that the evader strategy (7) and the pursuers' strategy (15) form a Nash equilibrium with respect to the modified performance indices.
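Equations (13)-(16) are essentially bookkeeping with Kronecker products. The following sketch (with illustrative dimensions and random gains, not the paper's data) assembles C_i and K_s and checks that u_p = K_s z reproduces the stacked per-pursuer controls u_i = K_i C_i z:

```python
import numpy as np

# Sketch: assemble C_i of Eq. (14) and the stacked gain K_s of Eq. (16).

def make_Ci(i, h, L, n):
    N = L.shape[0]
    d_i = np.zeros((1, N))
    d_i[0, i] = 1.0
    top = h[i] * d_i                  # h_i d_i^T   (1 x N)
    bot = d_i @ L                     # d_i^T L     (1 x N)
    return np.kron(np.vstack([top, bot]), np.eye(n))   # (2n x nN)

def make_Ks(K_list, h, L, n):
    return np.vstack([K_list[i] @ make_Ci(i, h, L, n)
                      for i in range(len(K_list))])
```

The check below confirms that the stacked feedback u_p = K_s z equals the concatenation of the individual u_i = K_i C_i z, which is what (15)-(16) assert.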
To achieve this goal, we first find a family of performance indices with respect to which the pursuers' strategy u_p in (15) and the evader's strategy u_e* in (7) form a Nash equilibrium. Then we use a new concept of best achievable performance indices to find a pair of performance indices within this family that are closest to the original performance indices J_p and J_e in (3) and (4).

Lemma 1: For the pursuit-evasion game described in (2)-(6), given a set of matrices {K_1, ..., K_N}, the strategies u_e* in (7) and u_p in (15) form a Nash equilibrium with respect to the performance indices

  J_ps = (1/2)||z(T)||²_{M_p} + (1/2) ∫_0^T ( ||z||²_{Q_ps} - u_p^T S z - z^T S^T u_p + ||u_p||²_{R_p} ) dt,    (17)
  J_es = (1/2)||z(T)||²_{M_e} + (1/2) ∫_0^T ( ||z||²_{Q_es} + ||u_e||²_{R_e} ) dt,    (18)

where

  S = B_p^T P_p + R_p K_s,    (19)
  Q_ps = Q_p - Δ_p,  Q_es = Q_e - Δ_e,
  Δ_p = P_p B_p R_p^{-1} B_p^T P_p - K_s^T R_p K_s,    (20)
  Δ_e = P_e B_p R_p^{-1} S + S^T R_p^{-1} B_p^T P_e,    (21)

K_s is as in (16), and P_p and P_e are the solutions of (8).

Proof: Consider the Lyapunov functions

  V_p = (1/2) z^T P_p z  and  V_e = (1/2) z^T P_e z,    (22)

where P_p and P_e are the solutions to the coupled Riccati equations (8a)-(8b). Differentiating V_p in (22) with respect to t, substituting the dynamics (2) and equation (8b), and integrating from 0 to T yields

  V_p(T) - V_p(0) = (1/2) ∫_0^T [ -||z||²_{Q_ps} + u_p^T S z + z^T S^T u_p - ||u_p||²_{R_p} + ||u_p - K_s z||²_{R_p} + 2 z^T P_p B_e (u_e + R_e^{-1} B_e^T P_e z) ] dt.

Hence,

  J_ps = V_p(0) + (1/2) ∫_0^T [ ||u_p - K_s z||²_{R_p} + 2 z^T P_p B_e (u_e + R_e^{-1} B_e^T P_e z) ] dt,    (23)

where J_ps is defined in (17). Similarly, we have

  J_es = V_e(0) + (1/2) ∫_0^T [ ||u_e + R_e^{-1} B_e^T P_e z||²_{R_e} + 2 z^T P_e B_p (u_p - K_s z) ] dt,    (24)

where J_es is defined in (18). Since R_p and R_e are positive definite, it is evident from (23) and (24) that u_p = K_s z and u_e = -R_e^{-1} B_e^T P_e z form a Nash equilibrium according to Definition 1. Clearly, if K_s = [(K_1 C_1)^T, ..., (K_N C_N)^T]^T = -R_p^{-1} B_p^T P_p, then J_ps and J_es in (17) and (18) become identical to (3) and (4). We propose the concept of the best achievable performance indices as follows.
Definition 2: Given the system (2) and performance indices (3) and (4), performance indices (17) and (18) are called the best achievable performance indices if there exists a set of matrices {K_1, ..., K_N} such that ||S||, ||Δ_p||, and ||Δ_e|| are minimized for all t, where S, Δ_p, and Δ_e are defined in (19)-(21) and ||·|| is any user-defined matrix norm.

Therefore, by finding matrices {K_1, ..., K_N} that correspond to the best achievable performance indices, we actually find a pair of performance indices for the pursuers and evader that are the closest to the original ones (3) and (4) while admitting strategies (7) and (15) as their Nash equilibrium. In order to determine {K_1, ..., K_N}, we need to solve a multi-objective optimization problem of minimizing ||S||, ||Δ_p||, and ||Δ_e|| simultaneously. One way to accomplish this is to minimize, at every t, a function consisting of a convex combination of these terms. That is,

  H(t) = α_1 ||S||²_F + α_2 ||Δ_p||²_F + α_3 ||Δ_e||²_F
       = α_1 Tr(S^T S) + α_2 Tr(Δ_p²) + α_3 Tr(Δ_e²),    (25)

where α_i > 0 and Σ_{i=1}^{3} α_i = 1. In (25), ||·||_F denotes the Frobenius norm, defined by ||M||²_F = Tr(M^T M). Note that the minimization of H(t) in (25) with respect to {K_1, ..., K_N} is generally quite difficult to solve analytically. Since K_s = Σ_{j=1}^{N} D_j K_j C_j, where D_i is an (m_1 + ... + m_N) × m_i block matrix with the ith block being an m_i × m_i identity matrix and all other blocks zero matrices, the gradient of H(t) with respect to K_i can be obtained as

  ∇_{K_i} H = 2 D_i^T R_p ( α_1 S - 2 α_2 K_s Δ_p + 2 α_3 R_p^{-1} B_p^T P_e Δ_e ) C_i^T,  i = 1, ..., N.    (26)

One possible approach to performing this minimization is to implement a gradient-based algorithm, such as steepest descent or conjugate gradient, to search for the optimal matrices {K_1, ..., K_N}. Note that these matrices are independent of the initial state vector. By varying the coefficients α_1, α_2, α_3, a noninferior set can be generated, and an appropriate choice of coefficients can then be made.

In the game process, we assume for simplicity that the pursuers' sensing abilities are updated at discrete instants of time t_0, t_1, ..., t_β at which the pursuers perform sensing, with t_0 = 0 and t_β = T. We assume that each interval (t_{i-1}, t_i) is sufficiently small, for all i = 1, ..., β, that the observations among the players can be regarded as constant within it.
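The objective (25) and its gradient (26) can be evaluated directly. Below is a minimal gradient-descent sketch in which P_p, P_e, and the C_i are random placeholders standing in for the quantities computed from (8) and (14); the constants in the gradient follow the form of (26):

```python
import numpy as np

# Sketch: evaluate H(t) of Eq. (25) and its gradient (26) for given
# P_p, P_e, and gains K_i, then take plain gradient-descent steps.

def H_and_grads(K_list, C_list, D_list, Pp, Pe, Bp, Rp, alphas):
    a1, a2, a3 = alphas
    RpInv = np.linalg.inv(Rp)
    Ks = sum(D_list[i] @ K_list[i] @ C_list[i] for i in range(len(K_list)))
    S = Bp.T @ Pp + Rp @ Ks                               # Eq. (19)
    Dp = Pp @ Bp @ RpInv @ Bp.T @ Pp - Ks.T @ Rp @ Ks     # Eq. (20)
    De = Pe @ Bp @ RpInv @ S + S.T @ RpInv @ Bp.T @ Pe    # Eq. (21)
    H = (a1 * np.trace(S.T @ S) + a2 * np.trace(Dp @ Dp)
         + a3 * np.trace(De @ De))
    grads = [2 * D_list[i].T @ Rp
             @ (a1 * S - 2 * a2 * Ks @ Dp + 2 * a3 * RpInv @ Bp.T @ Pe @ De)
             @ C_list[i].T for i in range(len(K_list))]
    return H, grads

def descend(K_list, step, iters, *args):
    """Plain steepest descent on the K_i with a fixed small step."""
    for _ in range(iters):
        _, g = H_and_grads(K_list, *args)
        K_list = [K - step * gi for K, gi in zip(K_list, g)]
    return K_list
```

A finite-difference check of one gradient entry is a cheap way to validate the analytic expression before using it in the search.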
Hence, once the information topology changes among the players at any of the time instants t_1, ..., t_β, the above algorithm is performed again to update the matrices K_1, ..., K_N for the remainder of the game.

V. EXAMPLE AND SIMULATION RESULTS

Let us now consider a three-pursuer single-evader game taking place in a planar environment. Suppose that all players have the same dynamics given by (1), where x_i represents the two position coordinates on the plane, F = 0, G_i = I_2, and u_i represents the velocity control, for i = e, 1, 2, 3. Then the system matrices in (2) become

  A = 0,  B_p = I_3 ⊗ I_2,  B_e = -1_3 ⊗ I_2.    (27)

The performance indices are given by (3) and (4) with

  M_p = 0,  Q_p = q (I_3 ⊗ I_2),  R_p = I_3 ⊗ I_2,
  M_e = 0,  Q_e = -(I_3 ⊗ I_2),  R_e = I_2,    (28)

where q > 0 is a scalar. The initial positions of the three pursuers and the evader are as shown in Fig. 3. We derive and simulate the results under various scenarios. In all of these scenarios we use a fixed set of weights α_1, α_2, α_3 (with α_2 = α_3) in (25). We also assume a capture radius ε = 0.7.

Scenario 1: If, in addition to the evader, all the pursuers also have wide enough sensing ranges (i.e., Λ = I and r_i → ∞ in (10) and (11)), the game is under global information. Solving the coupled differential Riccati equations (8a) and (8b) yields the solutions P_p and P_e; plots of representative entries of P_p and P_e are shown in Fig. 1. The pursuers' and evader's strategies can then be obtained from (9) and (7). In this case, the motion trajectories of the players and the distances between the pursuers and the evader over time are shown in Fig. 2. With a capture radius ε = 0.7 (indicated as a black line in Fig. 2), the evader is captured well before the final time.

Fig. 1. Plots of representative entries of P_p and P_e.

Scenario 2: Assume that the pursuers' sensing ranges are now reduced to a common finite value r_i = r for i = 1, 2, 3.
As shown in Fig. 3, at t = 0 pursuer 1 can only observe the evader, pursuer 2 can observe the evader and pursuers 1 and 3, and pursuer 3 can only observe pursuer 2. The motion trajectories and the distances between the pursuers and the evader over time are shown in Fig. 4. Clearly, in this scenario the pursuers are not able to capture the evader (with the same capture radius ε = 0.7) by the time the final time T is reached. The change in the topology of the game is reflected in the changes of the Laplacian matrix L(t) and vector h(t) in (10) and (11). In this scenario, L(t) changed three times during the game as the pursuers moved in and out of each other's sensing ranges.

Fig. 2. Motion trajectories and distances between the pursuers and the evader under global information. Each pursuer's distance to the evader is plotted, with the capture radius indicated by a dashed line.

Fig. 3. Initial positions of the three pursuers and the single evader.

Fig. 4. Motion trajectories and distances between the pursuers and the evader for Scenario 2.

The vector h, on the other hand, remained unchanged during the entire game, i.e., h_1(t) = h_2(t) = 1 and h_3(t) = 0. This essentially means that while pursuer 3 was never able to observe the evader during the entire game, it was able to observe pursuer 2 over an initial interval and lost that observation afterward.

Scenario 3: Using the same setting as Scenario 2, except with a larger value of q, the motion trajectories and the distances between the pursuers and the evader are shown in Fig. 5. Clearly, in this scenario (with ε = 0.7) the evader is captured before the final time. During the entire game, the Laplacian matrix again changed three times, and the vector h changed as well: after an initial interval all the pursuers were able to observe the evader, and later in the game all the pursuers were also able to observe each other. Note that although Scenarios 1 and 2 have the same performance indices, the pursuers are not able to catch the evader in Scenario 2 because of their limited observations. However, as q is increased between Scenarios 2 and 3, the pursuers are able to catch the evader in Scenario 3 because of the higher penalty on the distances in the pursuers' performance index.

Remark: It would be interesting to determine a critical value q_c of q which separates the escape and capture regions of the evader. That is, if q < q_c the evader escapes, and if q ≥ q_c the evader is captured at some time t ∈ [0, T].
For this game, the critical value of q has been determined to be q_c = .6. Figure 6 shows plots of the distances between the pursuers and the evader when q = q_c; in this case, capture occurs essentially at the final time. Figure 7 shows a plot of the capture time versus q. Clearly, as q increases beyond the critical value, the capture time decreases. The region 0 < q < q_c is where the evader escapes, i.e., the capture time is infinite.
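The search for q_c suggested in the remark can be organized as a bisection on simulated game outcomes. In the sketch below, is_captured is a hypothetical stand-in for "simulate the game with weight q and test capture"; the toy threshold rule in the usage is there only to exercise the logic, not to reproduce the paper's value:

```python
# Sketch: locate the critical weight q_c by bisection, assuming the
# capture outcome is monotone in q (escape below q_c, capture above).

def find_qc(is_captured, q_lo, q_hi, tol=1e-6):
    """Assumes escape at q_lo and capture at q_hi."""
    assert not is_captured(q_lo) and is_captured(q_hi)
    while q_hi - q_lo > tol:
        mid = 0.5 * (q_lo + q_hi)
        if is_captured(mid):
            q_hi = mid          # capture region: shrink from above
        else:
            q_lo = mid          # escape region: shrink from below
    return 0.5 * (q_lo + q_hi)
```

Each bisection step halves the bracketing interval, so the cost is logarithmic in the desired tolerance; the dominant cost is the game simulation inside is_captured.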

Fig. 5. Motion trajectories and distances between the pursuers and the evader for Scenario 3.

Fig. 6. Distances between the pursuers and the evader when q = q_c.

Fig. 7. Capture time versus q.

VI. CONCLUSION

In this paper, closed-loop Nash strategies for a linear N-pursuer single-evader differential game with quadratic performance indices over a finite time horizon are considered. The evader is assumed to have unlimited observations, while each pursuer has limited observations based on its own sensing radius. Since the evader is able to observe the states of all pursuers at all times, it implements a standard closed-loop Nash strategy using the well-known coupled Riccati differential equations approach. Under the limited-observations setting, however, each pursuer is able to observe the states of only those players who are within its sensing radius. Hence, the pursuers seek to implement a collective strategy that can form a Nash equilibrium with the evader's strategy. To accomplish this, the pursuers derive best achievable performance indices that are as close as possible to the original performance indices and such that the resulting pursuers' Nash strategy, based on the available observations, and the original evader's strategy form a Nash equilibrium. An illustrative example involving three pursuers and one evader is solved, and simulation results corresponding to different scenarios are presented.

REFERENCES

[1] R. Isaacs, Differential Games. John Wiley and Sons, 1965.
[2] Y. C. Ho, A. E. Bryson, Jr., and S. Baron, "Differential games and optimal pursuit-evasion strategies," IEEE Trans. Automatic Control, vol. AC-10, no. 4, pp. 385-389, 1965.
[3] A. W. Starr and Y. C. Ho, "Nonzero-sum differential games," Journal of Optimization Theory and Applications, vol. 3, pp. 184-206, 1969.
[4] M. Simaan and J. B. Cruz, Jr., "On the Stackelberg solution in nonzero-sum games," Journal of Optimization Theory and Applications, vol. 11, no. 5, pp. 533-555, 1973.
[5] M. Foley and W. Schmitendorf, "A class of differential games with two pursuers versus one evader," IEEE Trans. Automatic Control, vol. 19, 1974.
[6] S. D. Bopardikar, F. Bullo, and J. P. Hespanha, "On discrete-time pursuit-evasion games with sensing limitations," IEEE Trans. Robotics, vol. 24, no. 6, 2008.
[7] S. Bhattacharya and T. Başar, "Differential game-theoretic approach to a spatial jamming problem," in Advances in Dynamic Games, ser. Annals of the International Society of Dynamic Games, P. Cardaliaguet and R. Cressman, Eds. Birkhäuser Boston.
[8] D. Li and J. B. Cruz, Jr., "Defending an asset: A linear quadratic game approach," IEEE Trans. Aerospace and Electronic Systems, vol. 47, April 2011.
[9] H. Huang, W. Zhang, J. Ding, D. M. Stipanović, and C. J. Tomlin, "Guaranteed decentralized pursuit-evasion in the plane with multiple pursuers," in Proc. 50th IEEE Conference on Decision and Control, 2011.
[10] J. Guo, G. Yan, and Z. Lin, "Local control strategy for moving-target-enclosing under dynamically changing network topology," Systems & Control Letters, vol. 59, pp. 654-661, 2010.
[11] M. Wei, G. Chen, J. B. Cruz, Jr., L. S. Haynes, K. Pham, and E. Blasch, "Multi-pursuer multi-evader pursuit-evasion games with jamming confrontation," AIAA Journal of Aerospace Computing, Information, and Communication, vol. 4, no. 3, pp. 693-706, 2007.
[12] W. Lin, Z. Qu, and M. A. Simaan, "A design of distributed nonzero-sum Nash strategies," in Proc. 49th IEEE Conference on Decision and Control, Atlanta, GA, USA, 2010.
[13] R. Olfati-Saber and R. M. Murray, "Consensus problems in networks of agents with switching topology and time-delays," IEEE Trans. Automatic Control, vol. 49, no. 9, pp. 1520-1533, 2004.
8 pp. 97. [ M. Foley and W. Schmitendorf A class of differential games with two pursuers versus one evader IEEE Trans. Automatic Control vol. 9 pp. 9 97. [6 S. Bopardikar F.Bullo and J. Hespanha On discrete-time pursuitevasion games with sensing limitations IEEE Trans. Robotics vol. no. 6 pp. 9 9 8. [7 S. Bhattacharya and T. Baar Differential game-theoretic approach to a spatial jamming problem in Advances in Dynamic Games ser. Annals of the International Society of Dynamic Games P. Cardaliaguet and R. Cressman Eds. Birkhuser Boston vol. pp. 68. [8 D. Li and J. Cruz Defending an asset: A linear quadratic game approach IEEE Trans. Aerospace and Electronic Systems vol. 7 no. pp. 6 april. [9 H. Huang W. Zhang J. Ding D. Stipanovic and C. Tomlin Guaranteed decentralized pursuit-evasion in the plane with multiple pursuers in th IEEE CDC pp. 8 8. [ J. Guo G. Yan and Z. Lin Local control strategy for moving-targetenclosing under dynamically changing network topology Systems & Control Letters vol. 9 pp. 6 66. [ M. Wei G. Chen J. B. Cruz L. S. Haynes K. Pham and E. Blasch Multi-pursuer multi-evader pursuit-evasion games with jamming confrontation AIAA J. of Aerospace Computing Information and Comm vol. no. pp. 69 76 7. [ W. Lin Z. Qu and M. Simaan A design of distributed nonzero-sum nash strategies in 9th IEEE CDC Atlanta USA pp. 6 6. [ R. Olfati-Saber and R. Murray Consensus problems in networks of agents with switching topology and time-delays IEEE Trans. Automatic Control vol. 9 no. 9 pp.. Fig. 7. Capture versus q 76